Description

Attorney Docket No. 38195.73 

A VIDEOPHONE INTERPRETATION SYSTEM AND A VIDEOPHONE 
INTERPRETATION METHOD 

BACKGROUND OF THE INVENTION
1. Field of the Invention

The present invention relates to a videophone 
interpretation system and a videophone interpretation method 
which provide an interpretation service for a conversation with 
a videophone between persons speaking different languages,
and in particular, to a videophone interpretation system and
a videophone interpretation method which provide
administration services, such as those offered by a public office,
a hospital and a police station, to a foreigner who is incapable
of using the local language, without an interpreter being
present in the administrative bodies mentioned above.

2. Description of the Related Art

In recent years, persons in remote locations have come 
to converse with each other at a practical level by using a
videophone, due to developments in communications technologies.
In order for persons who speak different languages to
effectively converse
with each other, an interpreter is required. It is thus desired 
that an interpretation service with a videophone will become 
widely available. 

In the prior art, in order to obtain an interpretation 
service with a videophone, a three-way call must be
established by using a multipoint conferencing unit offering
a teleconference service between a caller who wants to
have a conversation, a callee as a conversation partner, and 
an interpreter who interprets between a language used by the 
caller and a language used by the callee. 

Fig. 22 shows a prior art configuration whereby an 
interpretation service is obtained by using a video conference 
service with a multipoint conferencing unit. In Fig. 22,
numeral 10 represents a videophone terminal for the caller
(hereinafter referred to as a caller terminal), numeral 20
represents a videophone terminal for the callee (hereinafter
referred to as a callee terminal), numeral 30 represents a
videophone terminal for the interpreter (hereinafter referred
to as an interpreter terminal), numeral 50 represents a public
telephone line, and numeral 1 represents a multipoint
conferencing unit. Each videophone terminal includes a
camera (a) for picking up the user, a display (b) for displaying
a received video, a dial pad (c) for dialing the number of a
distant party, and a headset (d) including a microphone for
acquiring the voice of the user and an earphone for listening
to the received audio. The multipoint conferencing unit 1
offers a videoconferencing service and includes a function
to accept a call from a reserved terminal, to synthesize video
and audio transmitted from the connected terminals, and to
transmit the synthesized video and audio to each terminal.

Next, the procedure used to obtain an interpretation
service using the multipoint conferencing unit will be described.
First, a caller searches for and calls an interpreter who is
capable of interpreting between the language used by the caller
and that used by the callee. Next, the called interpreter calls
the callee based on the request made by the caller and determines
a conversation date and time. When the conversation date
and time is determined, the caller reserves teleconferencing
at the multipoint conferencing unit 1. The caller, the callee
and the interpreter check in to the multipoint conferencing
unit 1 with their respective videophone terminals by using
the specified login information when the reserved date and
time is reached. This begins teleconferencing between
the caller terminal 10, the callee terminal 20 and the interpreter
terminal 30. On the display of each terminal, video obtained
by synthesizing the video of the caller, the video of the callee
and the video of the interpreter is displayed. To the earphone
of the headset of each terminal, audio obtained by
synthesizing the audio of the caller, the audio of the callee
and the audio of the interpreter is output. Thus, the caller
and the callee can have a videophone conversation while obtaining 
interpretation by the interpreter. 

In such a prior art videophone interpretation service 
using a multipoint conferencing unit, it is necessary to reserve
a teleconference on the multipoint conferencing unit before
starting a videophone conversation, and the caller must
search for an interpreter, contact the callee and hold a
consultation to set up a videoconference in advance.

Thus, it has been difficult to apply this approach to 
an interpretation service which requires immediate
support, such as where a foreigner who is incapable
of using the local language urgently needs to obtain
an administration service from a public office, a hospital or
a police station. The interpreter must join from the stage
of prior consultation between the caller and the callee. This
occupies the interpreter for a long time such that
the interpretation service cost increases.

SUMMARY OF THE INVENTION

To overcome the problems described above, preferred
embodiments of the invention provide a videophone
interpretation system and a videophone interpretation method
which eliminate the need for a caller to search for an
interpreter and consult with a callee in advance, and which
are available in an emergency, thereby minimizing the
time required of the interpreter and reducing the
interpretation service cost.


A videophone interpretation system according to
a preferred embodiment of the present invention is a system
in which an interpreter interprets a videophone conversation
between a caller and a callee who speak different languages,
the videophone interpretation system preferably includes
connection means for connecting a caller terminal, a callee
terminal and an interpreter terminal, and communication means
for communicating video and audio between the terminals
connected by the connection means, wherein
the connection means includes an interpreter registration table
in which at least the language types that are interpretable
by an interpreter and the terminal number of the interpreter
are registered, a function to accept a call from a caller terminal,
a function to acquire the terminal number of a callee, the language
type of the caller and the language type of the callee from
the caller terminal for which the call was accepted, a function
to extract the terminal number of the interpreter by referencing
the interpreter registration table from the acquired language
type of the caller and language type of the callee, a function
to call the interpreter terminal using the extracted
terminal number of the interpreter, and a function
to call the callee terminal by using the acquired terminal number
of the callee, and the communication means includes a function
to transmit video including at least video from the callee
terminal and audio including at least audio from the interpreter
terminal to the caller terminal, a function to transmit video
including at least video from the caller terminal and audio
including at least audio from the interpreter terminal to the
callee terminal, and a function to transmit audio including at
least audio from the caller terminal and audio from the callee
terminal to the interpreter terminal.

With this configuration, when a call is made from
a caller terminal, the terminal number of an interpreter capable
of interpreting between the language of the caller and the
language of the callee is extracted from the interpreter
registration table, and the caller terminal, the callee terminal
and the interpreter terminal are automatically connected, and
video and audio required for interpretation are communicated.
The caller need not previously search for an interpreter and
hold consultation with the callee, thus providing a videophone
interpretation service which is available even in an
emergency. The interpreter can join a videophone conversation
anywhere he/she may be, as long as he/she can be called. This
minimizes the time needed by the interpreter, and
thus, reduces the interpretation service cost.
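
The table lookup described above can be sketched as follows. This is
only an illustration under stated assumptions: the class name
InterpreterEntry, the function name find_interpreter, the language
codes and the terminal numbers are all hypothetical and do not appear
in the specification, and choosing the first matching entry is just
one possible selection rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterpreterEntry:
    """One row of a hypothetical interpreter registration table."""
    terminal_number: str
    languages: frozenset  # language types the interpreter can handle


def find_interpreter(table, caller_lang, callee_lang):
    """Return the terminal number of the first interpreter who covers
    both language types, or None when no such entry is registered."""
    for entry in table:
        if caller_lang in entry.languages and callee_lang in entry.languages:
            return entry.terminal_number
    return None


# Illustrative data only; the numbers and language codes are invented.
TABLE = [
    InterpreterEntry("0311110001", frozenset({"ja", "en"})),
    InterpreterEntry("0311110002", frozenset({"ja", "zh"})),
]
```

A connection unit built this way would call the returned terminal
number and the callee terminal number, and would need a fallback
(for example, a busy or no-match notice) when None is returned.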

In the videophone interpretation system according to
the preferred embodiments of the present invention, the
communication means preferably includes a function to transmit
video obtained by synthesizing video from the callee terminal
as a main window and video from the interpreter terminal as
a sub window to the caller terminal, a function to transmit
video obtained by synthesizing video from the caller terminal
as a main window and video from the interpreter terminal as
a sub window to the callee terminal, and a function to transmit
video obtained by synthesizing video from the caller terminal
and video from the callee terminal to the interpreter terminal.

This enables the caller and the callee to check
the expression of the interpreter in a Picture-in-Picture
fashion such that it is easier to understand the voice
of the interpreter. The interpreter can check the expression
of the caller and the expression of the callee such that a
precise interpretation is enabled. 
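
The window assignment described above can be summarized as a small
routing function. This is a sketch only: the function name
video_layout, the dictionary keys and the string labels are
hypothetical, and the actual pixel composition is left out.

```python
def video_layout(recipient):
    """Return which sources make up the picture sent to `recipient`.

    "main" and "sub" model the main window and the sub window of a
    Picture-in-Picture frame; "tiles" models a plain combined view.
    """
    if recipient == "caller":
        return {"main": "callee", "sub": "interpreter"}
    if recipient == "callee":
        return {"main": "caller", "sub": "interpreter"}
    if recipient == "interpreter":
        return {"tiles": ["caller", "callee"]}
    raise ValueError(f"unknown recipient: {recipient}")
```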

In the videophone interpretation system according to
preferred embodiments of the present invention, the
communication means preferably includes a first
audio transmission function to synthesize audio from the
callee terminal and audio from the interpreter terminal and
transmit the result to the caller terminal, a second audio
transmission function to synthesize audio from the
caller terminal and audio from the interpreter terminal and
transmit the result to the callee terminal, a third audio
transmission function to synthesize audio from the caller
terminal and audio from the callee terminal and transmit
the result to the interpreter terminal, and an unnecessary
side audio suppression function to suppress an unnecessary side
audio of either audio from the interpreter terminal supplied
to the first audio transmission function or audio from the
interpreter terminal supplied to the second audio transmission
function based on a command from the interpreter terminal,
wherein the first audio transmission
function includes a callee audio suppression function to
suppress audio from the callee terminal when audio from
the interpreter terminal is detected and the second audio
transmission function includes a caller audio suppression
function to suppress audio from the caller terminal when
audio from the interpreter terminal is detected.

In interpretations using a prior art videoconference,
audio obtained by synthesizing the audios of the three parties
is transmitted to each terminal.
Thus, when a user at a terminal speaks while a user at any other
terminal is speaking, the content of the conference is difficult
to understand. Thus, the interpreter
waits until the completion of the speech of the caller
before interpretation, a callee waits until the
completion of the interpretation before speech, and the
interpreter waits until the completion of the speech of
the callee before interpretation. Since such a procedure must
be repeated in a conference, it has been difficult to perform
a quick and precise interpretation. According to
preferred embodiments of the present invention, the
unnecessary side audio suppression function suppresses an
unnecessary side transmission of audio of the interpreter
to either the caller or the callee, based on a command from
the interpreter terminal. When the audio of the interpreter
is detected, transmission of the original audio of the callee
to the caller is suppressed by the callee audio suppression
function. When the audio of the interpreter is detected,
transmission of the original audio of the caller to the callee
is suppressed by the caller audio suppression function. With
these functions, the caller and the callee can understand
the interpretation even when their speech overlaps that of the
interpreter, thereby providing a quick and precise
videophone interpretation service.

The suppression includes a case where the level of an
audio signal is reduced in order to allow hearing to
some extent and a case where the audio signal is completely
turned off so as to mute the audio. The unnecessary side audio
suppression function includes a case where the audio of the
interpreter is transmitted selectively to either the caller
or the callee.
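
The two suppression cases described above (level reduction and
complete muting) can be illustrated with a minimal mixing sketch.
The function name mix_for_listener, the peak-based detection, and the
default values of gain_when_interpreting and threshold are
assumptions made for illustration; the specification does not state
how interpreter audio is detected or how far the level is reduced.

```python
def mix_for_listener(original, interpreter, gain_when_interpreting=0.2,
                     threshold=0.01):
    """Mix one frame of original speech with the interpreter's speech.

    `original` and `interpreter` are equal-length lists of float samples.
    When the interpreter frame carries signal (peak above `threshold`),
    the original is scaled by `gain_when_interpreting`; a gain of 0.0
    mutes it completely.
    """
    peak = max((abs(s) for s in interpreter), default=0.0)
    gain = gain_when_interpreting if peak > threshold else 1.0
    return [gain * o + i for o, i in zip(original, interpreter)]
```

The same function would serve the first audio transmission function
(original = callee audio) and the second (original = caller audio).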

In the videophone interpretation system according to
preferred embodiments of the present invention, the
communication means preferably includes a first audio
transmission function to selectively transmit either audio
from the callee terminal or audio from the interpreter
terminal to the caller terminal, a second audio transmission
function to selectively transmit either audio from the caller
terminal or audio from the interpreter terminal to the callee
terminal, a third audio transmission function to synthesize
audio from the caller terminal and audio from the callee
terminal and transmit the result to the interpreter terminal,
and an unnecessary side audio suppression function to suppress
an unnecessary side audio of either audio from the interpreter
terminal supplied to the first audio transmission function or
audio from the interpreter terminal supplied to the second
audio transmission function by a command from the interpreter
terminal, wherein the first audio
transmission function includes a function to turn off
audio from the callee terminal and transmit audio from the
interpreter terminal when audio from the interpreter terminal is
detected and the second audio transmission function
includes a function to turn off audio from the caller
terminal and transmit audio from the interpreter terminal
when audio from the interpreter terminal is detected.

According to preferred embodiments of the present
invention, the unnecessary side audio suppression function
suppresses an unnecessary side transmission of audio of the
interpreter to either the caller or callee, based on a command
from the interpreter terminal. When audio of the interpreter
is detected in the first audio transmission function, the
original audio of the callee switches to the audio of the
interpreter. When audio of the interpreter is detected in
the second audio transmission function, the original audio of
the caller switches to the audio of the interpreter. With these
functions, the caller and the callee can understand the
interpretation even when their speech overlaps that of the
interpreter, thereby providing a quick and precise
videophone interpretation service.

The unnecessary side audio suppression function includes a
case in which the audio of the interpreter is transmitted
selectively to either the caller or the callee. 
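
The switching behavior described above can be sketched per listener.
The function name select_for_listener and its parameters are
hypothetical. The sketch assumes that a listener on the unnecessary
side simply keeps receiving the other party's original audio, which
the specification does not state explicitly.

```python
def select_for_listener(original, interpreter, interpreter_active, wanted):
    """Pick the frame sent to one listener under the switching scheme.

    `wanted` is False when the interpreter has commanded that this
    listener should not receive the interpretation (the unnecessary
    side). In that case the original audio passes through unchanged.
    """
    if interpreter_active and wanted:
        return interpreter
    return original
```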

In the videophone interpretation system according to
preferred embodiments of the present
invention, the communication means preferably includes a first
audio transmission function to perform audio multiplexing
of audio from the callee terminal and audio from the
interpreter terminal and transmit the result to the caller
terminal, a second audio transmission function to perform audio
multiplexing of audio from the caller terminal and audio
from the interpreter terminal and transmit the result to
the callee terminal, a third audio transmission function to
perform audio multiplexing of audio from the caller terminal
and audio from the callee terminal and transmit the result
to the interpreter terminal, and an unnecessary side audio
suppression function to suppress an unnecessary side audio of
either audio from the interpreter terminal supplied to the
first audio transmission function or audio from the
interpreter terminal supplied to the second audio transmission
function, based on a command from the interpreter terminal.

According to preferred embodiments of the present
invention, the unnecessary side audio suppression function
suppresses an unnecessary side transmission of audio of the
interpreter to either the caller or callee, by a command from
the interpreter terminal. In the first audio transmission
function, the original audio of the callee and the audio of
the interpreter are multiplexed and the result is transmitted
to the caller. In the second audio transmission function, the
original audio of the caller and the audio of the interpreter
are multiplexed and the result is transmitted to the callee.
With these functions, the caller and the callee can
understand the interpretation even when their speech
overlaps that of the interpreter, thereby providing a quick
and precise videophone interpretation service.

The unnecessary side audio suppression function includes 
a case where the audio of the interpreter is selectively 
transmitted to either the caller or callee. 
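
One way to picture audio multiplexing is as two channels carried in
a single stream that the receiving terminal can separate again. The
sketch below interleaves samples for this purpose. Sample
interleaving and the names multiplex and demultiplex are assumptions
chosen for illustration; the specification does not say which
multiplexing scheme is used.

```python
def multiplex(original, interpretation):
    """Interleave two equal-length mono frames into one two-channel frame."""
    if len(original) != len(interpretation):
        raise ValueError("frames must have the same length")
    frame = []
    for o, i in zip(original, interpretation):
        frame.extend((o, i))
    return frame


def demultiplex(frame):
    """Split an interleaved frame back into (original, interpretation)."""
    return frame[0::2], frame[1::2]
```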

In the videophone interpretation system according to
preferred embodiments of the present invention, the
communication means preferably includes a function to record
video including video from the caller terminal, video from the
callee terminal and video from the interpreter terminal and
audio including audio from the caller terminal, audio from the
callee terminal and audio from the interpreter terminal, and a
function to reproduce and transmit the recorded video and audio
by a request from a terminal.

With this configuration, video and audio
from the caller, the callee and the interpreter in an
interpretation service are recorded. Since the details of
recording can be checked by a request from a terminal, it is
possible to review the contents which were not clear at the
time of the conversation or to check the details
of the communications service at a later time.

Video may be recorded by recording a
synthesized video of video to be transmitted to a caller
terminal and video to be transmitted to a callee terminal.
By doing so, it is possible to check the video received by the 
caller or callee. 

Audio may be recorded by recording audio
obtained by performing audio multiplexing on audio to be
transmitted to a caller terminal and audio to be transmitted
to a callee terminal. By doing so, it is possible to check
the contents in the language of the caller and that in the
language of the callee separately from a terminal equipped with 
an audio demultiplexing function. 

Alternatively, audio to be transmitted to
a caller terminal and audio to be transmitted to a callee
terminal may be recorded separately and the audio of a side
specified by a command from a terminal may be reproduced
for transmission. By doing so, it is possible to check the
contents in the language of the caller and that in the language
of the callee separately even from a terminal not equipped with 
an audio demultiplexing function. 
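
The separate-recording alternative described above can be sketched
as a recorder that keeps one track per side and replays the track
named in a request. The class name SeparateTrackRecorder and its
methods are hypothetical, and frames are treated as opaque values.

```python
class SeparateTrackRecorder:
    """Keeps the caller-bound and callee-bound audio as separate tracks."""

    def __init__(self):
        self._tracks = {"caller": [], "callee": []}

    def record(self, to_caller_frame, to_callee_frame):
        """Store one frame of the audio sent to each side."""
        self._tracks["caller"].append(to_caller_frame)
        self._tracks["callee"].append(to_callee_frame)

    def playback(self, side):
        """Return the recorded frames for the side named in the request."""
        if side not in self._tracks:
            raise ValueError(f"unknown side: {side}")
        return list(self._tracks[side])
```

Because each track is stored on its own, the requesting terminal
needs no demultiplexing step before playback.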

A videophone interpretation system according to
preferred embodiments of the
present invention is a system where a videophone conversation
between a caller and a callee using different languages is
interpreted by a first interpreter who interprets the language
of the callee into the language of the caller and a second
interpreter who interprets the language of the caller into the
language of the callee, the videophone interpretation system
preferably includes connection means for connecting
a caller terminal, a callee terminal, a first interpreter
terminal and a second interpreter terminal and communication
means for communicating video and audio between the
terminals connected by the connection means,
wherein the connection means includes an interpreter
registration table where at least the language types 
interpretable by an interpreter and the terminal number of the 
interpreter are registered, a function to accept a call from 
a caller terminal, a function to acquire the terminal number 
of a callee, language type of the caller and the language type 
of the callee from the caller terminal for which the call was 
accepted, a function to extract the terminal number of the first 
interpreter by referencing the interpreter registration table 
from the acquired language type of the callee and language type 
of the caller, a function to call the first interpreter by using 
the terminal number of the interpreter extracted, a function 
to extract the terminal number of the second interpreter by 
referencing the interpreter registration table from the 
acquired language type of the caller and language type of the
callee, a function to call the second interpreter by using the
terminal number of the interpreter extracted, and a function 
to call the callee terminal by using the acquired terminal number 
of the callee, and the communication means includes a
function to transmit video including at least video from
the callee terminal and audio including at least audio
from the first interpreter to the caller terminal, a function
to transmit video including at least video from the caller
terminal and audio including at least audio from the second
interpreter to the callee terminal, a function to transmit
audio including at least audio from the callee terminal to
the first interpreter terminal, and a function to transmit
audio including at least audio from the caller terminal to
the second interpreter terminal.

With this configuration, based on a call from the caller 
terminal, the terminal number of the first interpreter who 
interprets the language of the callee into the language of the 
caller and the terminal number of the second interpreter who 
interprets the language of the caller into the language of the 
callee are extracted from the interpreter registration table. 
The caller terminal, the callee terminal, the first interpreter 
terminal and the second interpreter terminal are automatically 
connected and video and audio required for interpretation
are communicated. The caller need not previously search for
an interpreter and conduct consultation with the callee,
thus providing a videophone interpretation service which
is available even in an emergency. The interpreter can join
a videophone conversation anywhere he/she may be, as long as
he/she can be called. This minimizes the time
required of the interpreter and reduces the interpretation
service cost. 
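
With two interpreters, the registration table is consulted once per
direction. The sketch below assumes a table keyed by a (source
language, target language) pair; this keying, the function name
find_directional_interpreters, and the sample entries in the usage
note are hypothetical.

```python
def find_directional_interpreters(table, caller_lang, callee_lang):
    """Look up one interpreter per direction.

    `table` maps a (source language, target language) pair to a terminal
    number. Returns (first, second): the first interpreter renders the
    callee's language into the caller's, the second the reverse.
    Either value is None when no entry is registered for that direction.
    """
    first = table.get((callee_lang, caller_lang))
    second = table.get((caller_lang, callee_lang))
    return first, second
```

For example, with invented entries {("en", "ja"): "100" and
("ja", "en"): "200"}, a Japanese-speaking caller and an
English-speaking callee would be assigned terminal "100" as the
first interpreter and terminal "200" as the second.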

In the videophone interpretation system according to
preferred embodiments of the present
invention, the communication means preferably includes a
function to transmit video obtained by synthesizing video
from the callee terminal as a main window and video from the
first interpreter terminal as a sub window to the caller terminal,
a function to transmit video obtained by synthesizing video
from the caller terminal as a main window and video from the
second interpreter terminal as a sub window to the callee
terminal, a function to transmit video obtained by synthesizing
video from the callee terminal and video from the caller
terminal to the first interpreter terminal, and a function to
transmit video obtained by synthesizing video from the caller
terminal and video from the callee terminal to the second
interpreter terminal.

This enables the caller and the callee to check
the expressions of the first interpreter and the second
interpreter, respectively, in a Picture-in-Picture fashion
such that it is easy to understand the voice of the interpreter.
Each interpreter can check the expression of the caller and
the expression of the callee such that a precise interpretation
is enabled.

In the videophone interpretation system according to
preferred embodiments of the present invention, the
communication means preferably includes a first
audio transmission function to synthesize audio from the
callee terminal and audio from the first interpreter terminal
and transmit the result to the caller terminal, a second
audio transmission function to synthesize audio from the
caller terminal and audio from the second interpreter terminal
and transmit the result to the callee terminal, a third audio
transmission function to transmit at least audio from the
callee terminal to the first interpreter terminal, and a fourth
audio transmission function to transmit at least audio from
the caller terminal to the second interpreter terminal,
wherein the first audio transmission
function includes a callee audio suppression function to
suppress audio from the callee terminal when audio from
the first interpreter terminal is detected and the second
audio transmission function includes a caller audio suppression
function to suppress audio from the caller terminal when
audio from the second interpreter terminal is detected.






According to various preferred embodiments of the
present invention, when the audio of the first interpreter is
detected, transmission of the original audio of the callee to
the caller is suppressed by the callee audio suppression function.
When the audio of the second interpreter is detected,
transmission of the original audio of the caller to the callee
is suppressed by the caller audio suppression function. With
these functions, the caller and the callee can understand
the interpretation even when their speech overlaps that of the
interpreter, thereby providing a quick and precise
videophone interpretation service.

The suppression includes a case in which the level
of an audio signal is reduced in order to allow hearing
to some extent and a case in which the audio signal is
turned off so as to mute the audio.

In the videophone interpretation system according to
preferred embodiments of the present invention, the
communication means preferably includes a first audio
transmission function to selectively transmit either audio
from the callee terminal or audio from the first interpreter
terminal to the caller terminal, a second audio transmission
function to selectively transmit either audio from the caller
terminal or audio from the second interpreter terminal to
the callee terminal, a third audio transmission function to
transmit at least audio from the callee terminal to the first
interpreter terminal, and a fourth audio transmission function
to transmit at least audio from the caller terminal to the
second interpreter terminal, wherein the
first audio transmission function includes a function to
turn off audio from the callee terminal and transmit
audio from the first interpreter terminal when detecting
audio from the first interpreter terminal and the second
audio transmission function includes a function to turn off
audio from the caller terminal and transmit audio from
the second interpreter terminal when detecting audio from
the second interpreter terminal.

According to preferred embodiments of the present
invention, when the audio of the first interpreter is detected
in the first audio transmission function, the original audio
of the callee is switched to the audio of the first interpreter.
When the audio of the second interpreter is detected in the
second audio transmission function, the original audio of the
caller is switched to the audio of the second interpreter.
With these functions, the caller and the callee can
understand the interpretation even when their speech
overlaps that of each interpreter, thereby providing a
quick and precise videophone interpretation service.




In the videophone interpretation system according to
preferred embodiments of the present
invention, the communication means preferably includes a first
audio transmission function to perform audio multiplexing
of audio from the callee terminal and audio from the first
interpreter terminal and transmit the result to the caller
terminal, a second audio transmission function to perform audio
multiplexing of audio from the caller terminal and
audio from the second interpreter terminal and transmit the
result to the callee terminal, a third audio transmission
function to transmit at least audio from the callee terminal
to the first interpreter terminal, and a fourth audio
transmission function to transmit at least audio from the
caller terminal to the second interpreter terminal.

According to preferred embodiments of the present
invention, in the first audio transmission function, the
original audio of the callee and the audio of the first
interpreter are audio multiplexed and the result is
transmitted to the caller. In the second audio transmission
function, the original audio of the caller and the audio of
the second interpreter are audio multiplexed and the result
is transmitted to the callee. With these functions, the
caller and the callee can understand the interpretation
even when their speech overlaps that of each interpreter, thereby
providing a quick and precise videophone interpretation
service.

In the videophone interpretation system according to
preferred embodiments of the present invention, the
communication means preferably includes a function to record
video including video from the caller terminal, video from the
callee terminal, video from the first interpreter terminal and
video from the second interpreter terminal and audio including
audio from the caller terminal, audio from the callee terminal,
audio from the first interpreter terminal and audio from
the second interpreter terminal, and a function to reproduce
and transmit the recorded video and audio by a request from
a terminal.

With this configuration, video and audio from the caller,
callee, first interpreter and second interpreter in an
interpretation service are recorded. Since the details of the
recording can be checked by a request from a terminal, it is
possible to review the contents which were not clear at the
time of the conversation or to check the details of
the communications service at a later time.

Video may be recorded by recording a synthesized video
of video to be transmitted to a caller terminal and video
to be transmitted to a callee terminal. By doing so, it is
possible to check the video received by the caller or the callee.
Audio may be recorded by recording audio
obtained by performing audio multiplexing on audio to be
transmitted to a caller terminal and audio to be transmitted
to a callee terminal. By doing so, it is possible to check
the contents in the language of the caller and that in the
language of the callee separately from a terminal equipped with
an audio demultiplexing function.
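
As a rough illustration of the multiplexed recording described
above, the following Python sketch interleaves the audio sent to
the caller terminal and the audio sent to the callee terminal
into one two-channel stream and separates them again. Sample
interleaving is only one assumed form of audio multiplexing, and
the function names and sample values are hypothetical.

```python
def multiplex(to_caller, to_callee):
    """Interleave two equal-length mono sample lists into one two-channel list."""
    if len(to_caller) != len(to_callee):
        raise ValueError("frames must have the same length")
    recorded = []
    for caller_sample, callee_sample in zip(to_caller, to_callee):
        recorded.extend((caller_sample, callee_sample))
    return recorded


def demultiplex(recorded):
    """Split the interleaved recording back into its two channels."""
    return recorded[0::2], recorded[1::2]


# Hypothetical sample values standing in for decoded audio.
recording = multiplex([10, 11, 12], [20, 21, 22])
caller_side, callee_side = demultiplex(recording)
```

A terminal with a demultiplexing function could then reproduce
either channel on its own.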

Alternatively, audio to be transmitted to
a caller terminal and audio to be transmitted to a callee
terminal may be recorded separately and the audio of a side
specified by a command from a terminal may be reproduced and
transmitted. By doing so, it is possible to check the contents
in the language of the caller and that in the language of the
callee separately even from a terminal not equipped with an
audio demultiplexing function.

In the videophone interpretation system according to
preferred embodiments of
the present invention, selection information for
selecting an interpreter is registered in the interpreter
registration table and the connection means preferably
includes a function to acquire the conditions for selecting
an interpreter from the caller terminal and a function to extract
the terminal number of an interpreter who satisfies the acquired
selection conditions by referencing the interpreter
registration table.

This selects an interpreter who satisfies the
purpose of a videophone conversation between a caller
and a callee from among the interpreters registered in the
interpreter registration table. Selection information for
selecting an interpreter includes information about the
sex, age, habitation, specialty, and qualification.

By registering the interpretation level of an interpreter
by language in the interpreter registration table, the user
can select an interpreter who has a desired level for an
interpretation between specified languages. An interpreter
can register a plurality of languages, if any, for
which he/she can provide interpretation. This enables
flexible and efficient selection of an interpreter.
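
The selection described above may be sketched as a simple table
lookup. The following Python sketch is illustrative only: the
field names (terminal_number, levels, specialty), the numeric
level scale and the sample entries are assumptions and are not
the layout of the interpreter registration table itself.

```python
from dataclasses import dataclass, field


@dataclass
class Interpreter:
    """One hypothetical row of the interpreter registration table."""
    terminal_number: str
    # Interpretation level per language; the 1-5 scale is an assumption.
    levels: dict = field(default_factory=dict)
    specialty: str = ""


def select_interpreters(table, caller_lang, callee_lang, min_level=1, specialty=None):
    """Return terminal numbers of interpreters who cover both languages."""
    matches = []
    for row in table:
        covers_both = (row.levels.get(caller_lang, 0) >= min_level
                       and row.levels.get(callee_lang, 0) >= min_level)
        specialty_ok = specialty is None or row.specialty == specialty
        if covers_both and specialty_ok:
            matches.append(row.terminal_number)
    return matches


# Hypothetical sample entries.
TABLE = [
    Interpreter("0311110001", {"en": 5, "ja": 4}, "medical"),
    Interpreter("0311110002", {"en": 2, "ja": 5}, "legal"),
    Interpreter("0311110003", {"zh": 5, "ja": 5}, "medical"),
]
```

For example, select_interpreters(TABLE, "en", "ja", min_level=3)
keeps only the entries whose level is at least 3 in both
languages.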

In a videophone interpretation system via bidirectional 
simultaneous interpretation, a listening comprehension level 
and a speaking level may be separately registered as 
interpretation levels by language to be registered in the 
interpreter registration table. By doing so, it is possible 
to individually select a person who is suitable as a
first interpreter and another who is suitable as
a second interpreter, thereby enabling flexible and
efficient selection of an interpreter.
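
A minimal sketch of selecting with separate listening and
speaking levels is given below in Python. It assumes that an
interpreter listens in the source language and speaks in the
target language, and that the weaker of the two levels limits
the interpretation quality; the scoring rule, field names and
sample entries are all hypothetical.

```python
def pick_interpreter(table, listen_lang, speak_lang):
    """Return the terminal number with the best combined level, or None."""
    best_number, best_score = None, 0
    for row in table:
        listening = row["listening"].get(listen_lang, 0)
        speaking = row["speaking"].get(speak_lang, 0)
        score = min(listening, speaking)  # assumed rule: weaker skill limits quality
        if score > best_score:
            best_number, best_score = row["terminal_number"], score
    return best_number


# Hypothetical entries: levels are registered per language and per skill.
TABLE = [
    {"terminal_number": "A", "listening": {"en": 5, "ja": 3}, "speaking": {"en": 2, "ja": 5}},
    {"terminal_number": "B", "listening": {"en": 3, "ja": 5}, "speaking": {"en": 5, "ja": 2}},
]

# Caller speaks Japanese, callee speaks English.
first_interpreter = pick_interpreter(TABLE, listen_lang="en", speak_lang="ja")
second_interpreter = pick_interpreter(TABLE, listen_lang="ja", speak_lang="en")
```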

In the videophone interpretation system according to
preferred embodiments of
the present invention, an availability flag to indicate
whether an interpreter is available is preferably registered
in the interpreter registration table and the connection
means preferably includes a function to refer to the availability
flag in the interpreter registration table to extract the
terminal number of an available interpreter.

In this manner, by registering whether an interpreter
is available in the interpreter registration table, an available 
interpreter is automatically selected and called. This 
eliminates useless calling and provides a more flexible and 
efficient videophone interpretation system. 
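
The use of the availability flag may be sketched as follows in
Python. The row layout, the flag name and the sample entries are
assumptions made for illustration.

```python
def extract_available(table, caller_lang, callee_lang):
    """Return the terminal number of the first available matching interpreter."""
    for row in table:
        # Skip interpreters whose availability flag is not set.
        if row["available"] and {caller_lang, callee_lang} <= row["languages"]:
            return row["terminal_number"]
    return None


# Hypothetical sample entries.
TABLE = [
    {"terminal_number": "100", "languages": {"en", "ja"}, "available": False},
    {"terminal_number": "101", "languages": {"en", "ja"}, "available": True},
    {"terminal_number": "102", "languages": {"ja", "zh"}, "available": True},
]
```

Here the interpreter at terminal "100" is never called for an
English/Japanese conversation while the flag is cleared.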

In the videophone interpretation system according to
preferred embodiments of
the present invention, the connection means preferably
includes a function to generate a text message to be transmitted
to each of the terminals and the communication means
includes a function to transmit the generated text message to
each of the terminals.

This transmits a text message which prompts each terminal 
to enter necessary information when connecting a caller terminal,
a callee terminal and an interpreter terminal. 

In the videophone interpretation system according to
preferred embodiments of
the present invention, the connection means preferably
includes a function to generate a voice message to be transmitted
to each of the terminals and the communication means
includes a function to transmit the generated voice message
to each of the terminals.

This transmits a voice message to a caller terminal, a 
callee terminal and an interpreter terminal when the caller 
terminal, callee terminal and interpreter terminal are to be 
connected. This makes it possible to provide a videophone 
interpretation service even when any of the caller, the callee 
and the interpreter is a visually impaired person. 

In the videophone interpretation system according to
preferred embodiments of
the present invention, the connection means preferably
includes a function to register a term used during a conversation
based on a command from each of the terminals and a function
to extract the registered term and generate a telop based on
a command from each of the terminals and the communication
means includes a function to transmit the generated telop to
each of the terminals.
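
Term registration and telop generation may be sketched as
follows in Python. The class name, the use of a short key such as
a dial pad number, and the sample term are hypothetical.

```python
class TelopRegistry:
    """Hypothetical store of terms registered for a conversation."""

    def __init__(self):
        self._terms = {}

    def register(self, key, term):
        """Register a term under a short key, e.g. a dial pad number."""
        self._terms[key] = term

    def generate_telop(self, key):
        """Return the caption text for a registered key, or None if unknown."""
        return self._terms.get(key)


registry = TelopRegistry()
registry.register("1", "resident registration (jumin toroku)")
telop = registry.generate_telop("1")
```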

In this manner, by registering in advance a term that
is difficult to interpret, it is possible to display a telop
on each of the terminals and to provide a videophone
interpretation service which is quicker and more accurate.

In the videophone interpretation system according to
preferred embodiments of
the present invention, accounting information about
an interpreter is registered in the interpreter registration
table and the connection means preferably includes a
function to measure the time that the caller terminal
or callee terminal obtains an interpretation service and a
function to calculate a fee from the measured time and accounting
information registered in the interpreter registration table.

By registering the accounting information about an
interpreter in the interpreter registration table, it is
possible to determine an appropriate fee for a videophone
interpretation service.

The interpreter registration table may register the 
interpretation level of an interpreter by language and an 
accounting table which specifies the relationship between the 
interpretation level and the hourly rates may be used to 
determine accounting information. By doing so, it is possible 
to determine an appropriate fee corresponding to the level of
the interpreter. 
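
The fee calculation may be sketched as follows in Python. The
accounting table below, which maps an interpretation level to an
hourly rate, uses hypothetical levels and rates, and billing in
proportion to the measured seconds is an assumption.

```python
from decimal import Decimal

# Hypothetical accounting table: interpretation level -> hourly rate.
HOURLY_RATE_BY_LEVEL = {1: Decimal("20"), 2: Decimal("35"), 3: Decimal("50")}


def interpretation_fee(level, seconds_used):
    """Calculate a fee from the measured time and the rate for the level."""
    rate = HOURLY_RATE_BY_LEVEL[level]
    fee = rate * Decimal(seconds_used) / Decimal(3600)
    return fee.quantize(Decimal("0.01"))  # round to two decimal places
```

With these assumed rates, a 30-minute session with a level 3
interpreter costs half of the hourly rate.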

A videophone interpretation method according to
preferred embodiments of the present invention is a videophone
interpretation method in which an interpreter interprets
a videophone conversation between a caller and a callee who
speak different languages, the method using an interpreter
registration table in which at least the language types
interpretable by an interpreter and the terminal number of the
interpreter are registered, wherein the
method includes steps of accepting a call from a caller
terminal, acquiring the terminal number of a callee, the language
type of the caller and the language type of the callee from
the caller terminal for which the call was accepted, extracting
the terminal number of the interpreter by referencing the
interpreter registration table from the acquired language type
of the caller and language type of the callee, calling the
interpreter terminal by using the extracted terminal number of
the interpreter, calling the callee terminal by using
the acquired terminal number of the callee, transmitting video
including at least video from the callee terminal and audio
including at least audio from the interpreter terminal to
the caller terminal, transmitting video including at least
video from the caller terminal and audio including at least
audio from the interpreter terminal to the callee terminal,
and transmitting audio including at least audio from the
caller terminal and audio from the callee terminal to the
interpreter terminal.

With this configuration, upon a call from a caller terminal,
the terminal number of an interpreter capable of interpreting
between the language of the caller and the language of the callee
is extracted from the interpreter registration table, and the
caller terminal, the callee terminal and the interpreter
terminal are automatically connected, and video and audio
required for interpretation are communicated. The caller need
not previously search for an interpreter and conduct
consultation with the callee, thus providing a videophone
interpretation service which is available even in an
emergency. The interpreter can join a videophone conversation
anywhere he/she may be, as long as he/she can be called. This
minimizes the time occupied by the interpreter
and reduces the interpretation service cost.
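
The connection steps of this method may be sketched as follows
in Python. The table layout, the dial callback and the terminal
numbers are hypothetical, and the media transmission steps are
omitted.

```python
def connect_call(table, caller_terminal, callee_number, caller_lang, callee_lang, dial):
    """Extract an interpreter for the language pair, then call both parties."""
    interpreter_number = next(
        (number for number, languages in table.items()
         if caller_lang in languages and callee_lang in languages),
        None,
    )
    if interpreter_number is None:
        return None  # no registered interpreter covers this language pair
    dial(interpreter_number)  # call the interpreter terminal
    dial(callee_number)       # call the callee terminal
    return {
        "caller": caller_terminal,
        "callee": callee_number,
        "interpreter": interpreter_number,
    }


# Hypothetical registration table: terminal number -> interpretable languages.
TABLE = {"0311110001": {"en", "ja"}, "0311110002": {"zh", "ja"}}
```

The dial argument stands in for whatever call control the line
interfaces provide; passing a list's append method is enough to
observe the calling order.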

A videophone interpretation method according to
preferred embodiments of the
present invention is a method in which a videophone
conversation between a caller and a callee using different
languages is interpreted by a first interpreter who interprets
the language of a callee into the language of a caller and a
second interpreter who interprets the language of the caller
into the language of the callee, the method using an interpreter
registration table in which at least the language types
interpretable by an interpreter and the terminal number of the
interpreter are registered, wherein the
method includes steps of accepting a call from a caller
terminal, acquiring the terminal number of a callee, the language
type of the caller and the language type of the callee from
the caller terminal for which the call was accepted, extracting
the terminal number of a first interpreter by referencing the
interpreter registration table from the acquired language type
of the callee and language type of the caller, calling the first
interpreter terminal by using the extracted terminal number of
the first interpreter, extracting the terminal number of a
second interpreter by referencing the interpreter registration
table from the acquired language type of the caller and language
type of the callee, calling the second interpreter terminal
by using the extracted terminal number of the second interpreter,
calling the callee terminal by using the acquired terminal
number of the callee, transmitting video including at least video
from the callee terminal and audio including at least
audio from the first interpreter terminal to the caller terminal,
transmitting video including at least video from the caller
terminal and audio including at least audio from the second
interpreter terminal to the callee terminal, transmitting
audio including at least audio from the callee terminal to
the first interpreter terminal, and transmitting audio
including at least audio from the caller terminal to the
second interpreter terminal.

With this configuration, upon a call from a caller terminal,
the terminal number of a first interpreter who interprets the
language of the callee into the language of the caller and the
terminal number of a second interpreter who interprets the
language of the caller into the language of the callee are
extracted. The caller terminal, the callee terminal, the first
interpreter terminal, and the second interpreter terminal are
automatically connected, followed by communications of video
and audio required for interpretation. The caller need not
previously search for an interpreter and conduct
consultation with the callee, thus providing a videophone
interpretation service which may be available even in an
emergency. The interpreter can join a videophone conversation
anywhere he/she may be, as long as he/she can be called. This
minimizes the time occupied by the interpreter and
reduces the interpretation service cost.

Other features, elements, steps, characteristics and
advantages of the present invention will become more apparent
from the following detailed description of preferred embodiments
thereof with reference to the attached drawings.

BRIEF DESCRIPTION OF THE DRAWINGS 
Fig. 1 is a system block diagram of a videophone 
interpretation system according to a first preferred embodiment 
of the present invention;

Fig. 2 shows an example of a video displayed on the screen
of a terminal in the videophone interpretation system according
to the first preferred embodiment of the present invention;

Fig. 3 shows an example of an interpreter registration 
table in the videophone interpretation system according to the 
first preferred embodiment of the present invention; 

Fig. 4 is a processing flowchart of the control processing 
of a controller in the videophone interpretation system 
according to the first preferred embodiment of the present 
invention;

Fig. 5 shows an example of a screen for prompting input
of the language type of a caller and a callee;

Fig. 6 shows an example of a screen for prompting input 
of interpreter selection conditions; 

Fig. 7 shows an example of a screen for prompting input 
of the terminal number of a callee; 

Fig. 8 is a system block diagram of a videophone 
interpretation system according to a second preferred 
embodiment of the present invention;

Fig. 9 shows an example of a connection table; 

Fig. 10 is a processing flowchart of the control processing
of a controller in the videophone interpretation system 
according to the second preferred embodiment of the present 
invention; 

Fig. 11 is a system block diagram of a videophone
interpretation system according to a third preferred embodiment
of the present invention;

Fig. 12 shows an example of video displayed on the screen
of a terminal in the videophone interpretation system according 
to the third preferred embodiment of the present invention; 

Fig. 13 shows an example of an interpreter registration 
table in the videophone interpretation system according to the 
third preferred embodiment of the present invention; 

Fig. 14 is a processing flowchart of the control processing
of a controller in the videophone interpretation system 
according to the third preferred embodiment of the present 
invention; 

Fig. 15 is a block diagram showing an example of an
audio communications function in the videophone interpretation
system according to the first preferred embodiment of the present
invention;

Fig. 16 is a block diagram showing another example
of the audio communications function in the videophone
interpretation system according to the first preferred
embodiment of the present invention;

Fig. 17 is a block diagram showing an example of the
audio communications function in the videophone interpretation
system according to the third preferred embodiment of the present
invention;

Fig. 18 is a block diagram showing another example
of the audio communications function in the videophone
interpretation system according to the third preferred
embodiment of the present invention;

Fig. 19 is a block diagram showing an example of a
recording/reproduction function in the videophone
interpretation system according to the first preferred
embodiment of the present invention;

Fig. 20 is a block diagram showing an example of a
recording/reproduction function in the videophone
interpretation system according to the third preferred
embodiment of the present invention;

Fig. 21 shows an example of video displayed on each
terminal screen by way of the recording/reproduction function;
and

Fig. 22 is a system block diagram of a prior art videophone
interpretation system using a videoconference service with a
multipoint conferencing unit.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS 
Fig. 1 is a system block diagram of a videophone 
interpretation system according to a first preferred embodiment 
of the invention. This preferred embodiment shows a system 
configuration example assuming that a terminal used by a caller,
a callee or an interpreter is a telephone-type videophone
terminal connected to a public telephone line.

In Fig. 1, a numeral 100 represents a videophone 
interpretation system installed in an interpretation center 
which provides an interpretation service. The videophone 
interpretation system 100 interconnects a videophone terminal 
used by a caller (hereinafter referred to as a caller terminal) 
10, a videophone terminal used by a callee (hereinafter referred 
to as a callee terminal) 20, and a videophone terminal used 
by an interpreter (hereinafter referred to as an interpreter 
terminal) 30 via a public telephone line 40 in order to provide
a videophone interpretation service in which a videophone
conversation between a caller and a callee is interpreted by 
an interpreter. 

The caller terminal 10, callee terminal 20 and interpreter
terminal 30 each include a television camera (a) for
capturing each user, a display screen (b) for displaying the
received video, a dial pad (c) for input of a number or
information, and a headset (d) for audio input/output. Input/output
of voice is not necessarily made using a headset; a handset
of a typical telephone set may be used instead.

Such a videophone terminal connected to a public line 
may be an ISDN videophone terminal based on ITU-T recommendation 
H.320. The present invention may use a videophone terminal 
which uses a unique protocol.

The public telephone line may be of a wireless type. The
videophone terminal may be a cellular phone or a portable
terminal equipped with a videophone function.

The videophone interpretation system 100
includes a caller terminal line interface (interface
being hereinafter referred to as I/F) 120 to connect to a caller
terminal, a callee terminal line I/F 140 to connect to a callee
terminal, and an interpreter terminal line I/F 160 to connect
to an interpreter terminal. To each I/F, a
multiplexer/demultiplexer 122, 142, 162 for
multiplexing/demultiplexing a video signal, an audio signal
or a data signal, a video CODEC (coder/decoder) 124, 144, 164
for compressing/expanding a video signal, and an audio CODEC
126, 146, 166 for compressing/expanding an audio signal are
connected. Each line I/F, each multiplexer/demultiplexer, and
each video CODEC or each audio CODEC performs call control,
streaming control and compression/expansion of a video/audio
signal in accordance with a protocol used by each terminal.

To the video input of the caller terminal video CODEC
124, a video synthesizer 128 for synthesizing the video output
of the callee terminal video CODEC 144, the video output of
the interpreter terminal video CODEC 164 and the output of the
caller terminal telop memory 132 is connected. To the video
input of the callee terminal video CODEC 144, a video synthesizer
148 for synthesizing the video output of the caller terminal
video CODEC 124, the video output of the interpreter terminal
video CODEC 164, and the output of the callee terminal telop
memory 152 is connected.

To the video input of the interpreter terminal video CODEC
164, a video synthesizer 168 for synthesizing the video output
of the caller terminal video CODEC 124, the video output of
the callee terminal video CODEC 144, and the output of the
interpreter terminal telop memory 172 is connected.

While video display of an interpreter may be omitted on
a caller terminal or a callee terminal, understanding of the
voice interpreted by the interpreter is facilitated
by displaying the video of the interpreter, such that it is
preferable to be able to synthesize the video of an interpreter.

While video display of a caller or a callee may be omitted
on an interpreter terminal, understanding of the voice
interpreted by the interpreter is facilitated by
displaying the videos, such that it is preferable to be able
to display the video of a caller or a callee.

Fig. 2 shows an example of a video displayed on the screen 
of each terminal during a videophone conversation by way of 
the videophone interpretation system 100. Fig. 2 (a) shows the 
screen of a caller terminal, on which a synthesized video of 
a callee and an interpreter obtained by the video synthesizer 
128 is displayed. While the video of the callee is displayed
as a main window and the video of the interpreter is displayed
as a sub window in a Picture-in-Picture fashion in this example,
the Picture-in-Picture may display
the video of the interpreter as a main window and the video
of the callee as a sub window. Alternatively, these videos may be
displayed in equal size. Fig. 2(b) shows the screen of a callee
terminal, on which a synthesized video of a caller and an interpreter
obtained by the video synthesizer 148 is displayed.
While the video of the caller is displayed as a main window
and the video of the interpreter is displayed as a sub window
in a Picture-in-Picture fashion in this example, the
Picture-in-Picture may display the
video of the interpreter as a main window and the video of the
caller as a sub window. Alternatively, these videos may be
displayed in equal size. Fig. 2(c) shows the screen of an interpreter
terminal, on which a synthesized video of a caller and a callee
obtained by the video synthesizer 168 is displayed.

To the audio input of the caller terminal audio CODEC
126, an audio synthesizer 130 for synthesizing the audio output
of the callee terminal audio CODEC 146 and the audio output
of the interpreter terminal audio CODEC 166 is connected. To
the audio input of the callee terminal audio CODEC 146, an audio
synthesizer 150 for synthesizing the audio output of the caller
terminal audio CODEC 126 and the audio output of the interpreter
terminal audio CODEC 166 is connected.

To the audio input of the interpreter terminal audio CODEC
166, an audio synthesizer 170 for synthesizing the audio output
of the caller terminal audio CODEC 126 and the audio output
of the callee terminal audio CODEC 146 is connected.

The audio output of the interpreter terminal audio CODEC 
166 is input to a selector 174. Based on a command from an 
interpreter terminal, the audio output is supplied to the caller
terminal audio synthesizer 130 in case the interpreter 
interprets the language of the callee to the language of a caller, 
and to the callee terminal audio synthesizer 150 in case the 
interpreter interprets the language of a caller to the language 
of the callee. As a result, the audio of the interpreter is 
transmitted to either the caller or the callee requiring the 
audio. Thus, it is possible to prevent the speech of a caller 
or a callee from being disturbed by the unnecessary voice of 
an interpreter, thereby providing a smooth conversation. 

The caller terminal audio synthesizer 130 is equipped
with a function to suppress an audio level from the callee
terminal or switch audio from the callee terminal to audio
from the interpreter terminal when audio from the interpreter
terminal is detected. The callee terminal audio synthesizer
150 is equipped with a function to suppress an audio level from
the caller terminal or switch audio from the caller
terminal to audio from the interpreter terminal when audio
from the interpreter terminal is detected. This prevents
the audio of the interpretation by the interpreter from
overlapping the audio of the opponent party, which causes
difficulty in listening. The interpreter can simultaneously
interpret the speech of the speaker, thus enabling a quick
and precise interpretation.

Fig. 15 shows specific examples of the function to switch 
the destination of the interpreter audio in the selector 174 
and the function to suppress the audio of the callee or caller 
in the audio synthesizers 130, 150. As shown in Fig. 15, the 
audio output of the interpreter terminal audio CODEC 166 is 
connected to a caller terminal audio signal adder 190 and
a callee terminal audio signal adder 193 via the selector 174.
The audio of the interpreter is supplied to either the caller
or callee by a signal from a PB detector 175. The PB detector
175 detects a predetermined number for selecting a caller or
a callee that is pressed on the dial pad of a terminal, based
on a data signal or a tone signal included in an audio signal
from the interpreter terminal, and switches the selector 174
to the specified side. The interpreter specifies the caller
or callee as a destination of his/her voice by the dial pad
before he/she interprets. Thus, the caller or the callee who 
need not listen to the audio of the interpreter does not receive 
the audio of the interpreter. 
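
The behavior of the selector 174 under control of the PB detector
175 may be sketched as follows in Python. The assignment of dial
pad digits ("1" for the caller, "2" for the callee), the default
destination and the use of silent frames for the unselected side
are assumptions made for illustration.

```python
class Selector:
    """Sketch of routing interpreter audio to one of two adders."""

    DESTINATION_BY_DIGIT = {"1": "caller", "2": "callee"}  # assumed key assignment

    def __init__(self):
        self.destination = "caller"  # assumed default

    def on_dial_pad(self, digit):
        """Switch to the specified side; unknown digits are ignored."""
        self.destination = self.DESTINATION_BY_DIGIT.get(digit, self.destination)

    def route(self, interpreter_frame):
        """Return (frame for the caller adder, frame for the callee adder)."""
        silence = [0] * len(interpreter_frame)
        if self.destination == "caller":
            return interpreter_frame, silence
        return silence, interpreter_frame


selector = Selector()
to_caller, to_callee = selector.route([5, 6])
```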

The audio output of the callee terminal audio CODEC 146
is connected to the caller terminal audio signal adder 190 via
an attenuator 191, which attenuates the audio from the callee
terminal when the audio from the interpreter is detected by
the signal detector 192. The audio output of the caller terminal
audio CODEC 126 is connected to the callee terminal audio signal
adder 193 via an attenuator 194, which attenuates the audio from
the caller terminal when the audio of the interpreter is detected
by the signal detector 195. The signal detectors 192, 195 are
set to an appropriate detection level in order to prevent the
audio of the opponent party from being attenuated by mistake
due to noise and the like.
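
The combination of signal detector, attenuator and audio signal
adder may be sketched as follows in Python. Frames are assumed to
be equal-length lists of 16-bit PCM samples; the peak-based
detection, the threshold of 500 and the attenuation factor of
0.25 are hypothetical values chosen only for illustration.

```python
def mix_with_ducking(party_frame, interpreter_frame, threshold=500, attenuation=0.25):
    """Add two frames, attenuating the other party while the interpreter speaks."""
    # Signal detector: interpreter audio counts as present above the threshold,
    # so low-level noise does not attenuate the other party by mistake.
    peak = max((abs(sample) for sample in interpreter_frame), default=0)
    gain = attenuation if peak >= threshold else 1.0

    mixed = []
    for party, interp in zip(party_frame, interpreter_frame):
        total = int(party * gain) + interp      # attenuator followed by adder
        mixed.append(max(-32768, min(32767, total)))  # clip to the 16-bit range
    return mixed
```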

In order to ensure that the caller or the callee can hear 
the audio of the interpreter immediately after the audio of 
the interpreter is detected by the signal detector 192, 195, 
an appropriate signal delay unit may be provided at the 
interpreter audio input of the audio signal adder 190, 193. 

While the audio of the opponent party is attenuated by
the attenuator 191, 194 such that the caller or the callee
can hear the original voice of the opponent party to some extent
in the background of the audio of the interpreter in this
embodiment, a switch may be provided instead to turn
off the audio of the opponent party.

Fig. 16 shows an example in which the audio of the
opponent party is turned off when the audio of the
interpreter is transmitted and only the audio of the interpreter
is transmitted. As shown in Fig. 16, switches 197, 198 are
used instead of the audio signal adders 190, 193. When the
audio of the interpreter is detected by the signal detectors
192, 195, the switches 197, 198 are turned from the audio of
the opponent party to the audio of the interpreter. The
remaining configuration is the same as that shown in Fig. 15.
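
The switching variant may be sketched as follows in Python. As
above, peak-based detection and the threshold of 500 are
assumptions, and frames are assumed to be lists of PCM samples.

```python
def switch_output(party_frame, interpreter_frame, threshold=500):
    """Pass the interpreter frame while the interpreter speaks, else the party frame."""
    peak = max((abs(sample) for sample in interpreter_frame), default=0)
    return interpreter_frame if peak >= threshold else party_frame
```

Unlike the adder configuration, the receiving party hears none of
the original voice while the interpreter is speaking.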

In addition, in order to ensure that the
caller or the callee can hear the audio of the interpreter 
immediately after the audio of the interpreter is detected by 
the signal detector 192, 195, an appropriate signal delay unit 
may be provided at the interpreter audio input of the switches 
197, 198. 

While the audio signal adder 190, 193 simply adds the
audio of the interpreter and the audio of the opponent party
in the above example, audio multiplexing of two signals may
be used as well. For example, if a terminal
supports stereophonic audio, stereophonic synthesis is
performed with the audio of the opponent party as the left channel
and the audio of the interpreter as the right channel and the
resultant signal is transmitted to a terminal, where the
receiving party selects the necessary audio. In this
configuration, it is not necessary to provide an attenuator
to attenuate the audio of the opponent party in the videophone
interpretation system. The receiving party listens to the
audio while adjusting the volume balance of the right
and left channels of a headset.

While the audio of the interpreter is transmitted to either 
the caller or the callee as selected by the switch 174 in the 
above example, the audio of the interpreter may be supplied 
to each of the audio signal adder 190 (or the switch 197) and 
the audio signal adder 193 (or the switch 198) via an attenuator 
in order to attenuate an audio signal to a party where the audio 
is not required based on detection by the PB detector 175. In 
this manner, some of the audio of the interpreter is
transmitted to the speaker by using an attenuator. The speaker
thus checks that his/her speech is interpreted while he/she 
is speaking. 

The videophone interpretation system 100 is equipped with
an interpreter registration table 112 in which the terminal
number of an interpreter is registered and includes a controller
110 connected to each of the line I/Fs 120, 140, 160,
multiplexers/demultiplexers 122, 142, 162, video synthesizers
128, 148, 168, audio synthesizers 130, 150, 170, and telop
memories 132, 152, 172. The controller 110 provides a function
to connect a caller terminal, a callee terminal and an
interpreter terminal by using a function to accept a
call from a caller terminal, a function to acquire the language
type of the caller and the language type of the callee, a function
to acquire the selection conditions for selecting an interpreter,
a function to extract the terminal number of the interpreter
by referencing the interpreter registration table 112 using
the acquired language type and selection conditions, a function
to call the interpreter terminal using the terminal number
of the interpreter extracted, and a function to call the callee
terminal using the acquired terminal number of the callee.

Operation of the video synthesizers 128, 148, 168 and
audio synthesizers 130, 150, 170 is controlled by the controller
110. A function is included in which the user changes
the video output method or audio output method by pressing a
predetermined number button of a dial pad of each terminal.
The multiplexer/demultiplexer 122,
142, 162 detects the number button on the dial pad of each terminal
that is pressed based on a data signal or a tone signal and
signals the detection to the controller. This ensures
flexibility in the usage of the system on each terminal. For
example, only necessary videos or audios are selected and
displayed/output in accordance with the object, or it is possible
to replace a main window with a sub window, or change the position
of the sub window.

To the input of the video synthesizers 128, 148, 168,
a caller terminal telop memory 132, a callee terminal telop
memory 152, and an interpreter terminal telop memory 172 are
connected respectively. Contents of each telop memory 132,
152, 172 can be set by the controller 110. With this
configuration, by setting a message to be displayed on each
terminal to the telop memory 132, 152, 172 and issuing a command
to select a signal of the telop memory 132, 152, 172 to the
video synthesizer 128, 148, 168 in the setup of a videophone
conversation via interpretation, it is possible to transmit
necessary messages to respective terminals to establish a
three-way call.

If there is a term which is difficult to explain
or a word that is difficult to pronounce in a videophone
conversation, it is possible to register in advance the term
in the term registration table 113 of the controller 110 in
association with the number of the dial pad on each terminal.
By doing so, it is possible to detect that the dial pad on each
terminal is pressed during a videophone conversation by using
a data signal or a tone signal on the multiplexer/demultiplexer
122, 142, 162, extract a term corresponding to the number of
the dial pad pressed from the term registration table 113,
generate a text telop, and set the text telop to each telop
memory, thereby displaying the term on each terminal. This
communicates, by way of a text telop, to the opponent party
a term that is difficult to explain or a word that
is difficult to pronounce, to thus provide
a quicker and more precise videophone conversation.
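
As a rough illustration of the term lookup described above, the
following sketch models the term registration table as a mapping from
a dial pad number to a term and the telop memories as a mapping from
a terminal name to the text currently displayed. The table entries,
the terminal names and the function name on_dial_key are hypothetical
and serve only to make the example concrete.

```python
# Hypothetical contents of a term registration table:
# dial pad number -> term registered in advance.
TERM_TABLE = {
    "1": "myocardial infarction",
    "2": "certificate of residence",
}


def on_dial_key(digit, term_table, telop_memories):
    """Handle a dial pad press detected during a conversation.

    If a term is registered for the digit, it is written as a text
    telop to every telop memory so that it is displayed on each
    terminal. Returns True when a term was displayed.
    """
    term = term_table.get(digit)
    if term is None:
        return False                      # no term registered for this key
    for terminal in telop_memories:
        telop_memories[terminal] = term   # set the text telop
    return True
```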

Next, the connection processing by the controller 110 
for establishing a videophone conversation via interpretation 
is described.

Prior to processing, interpreter selection information
and a terminal number of a terminal used by each interpreter
are registered in the interpreter registration table 112 of
the controller 110 from an appropriate terminal (not shown).
Fig. 3 shows an example of a registration item to be registered
in the interpreter registration table 112. The interpreter
selection information is information for selecting an
interpreter desired by a user, which includes a gender, an age,
supported languages, a habitation, a specialty, and the like.
For the supported languages, the level of an interpreter is
registered by language to enable the user to select an
interpreter of a desired level between the target languages.
In this example, the levels of interpretation are represented
by 1 (Advanced), 2 (Middle) and 3 (Basic). The habitation
assumes a case in which the user desires a person who has
geographic knowledge of a specific area and, in this example,
a ZIP code is used to specify an area. The specialty assumes
a case in which, if the conversation pertains
to a specific field, the user desires a person who has expert
knowledge of the field or is familiar with the topics in the
field. In this example, the fields an interpreter is familiar
with are classified into several categories to be registered,
such as politics, law, business, education, science and
technology, medical care, language, sports, and hobby. The
specialties are diverse, such that they may be registered
hierarchically and searched through at a level desired by the
user.

In addition, qualifications of the interpreter may be
registered in advance such that the user can select a qualified
person as an interpreter.

The terminal number to be registered is the telephone
number of the terminal because, in this example, a videophone
terminal to connect to a public telephone line is
provided.

In the interpreter registration table 112,
an availability flag is provided to indicate whether an
interpreter accepts the interpretation. A registered
interpreter can call the interpretation center from his/her
terminal and enter a command by using a dial pad to set/reset
the availability flag. Thus, an interpreter registered in the
interpreter registration table can set the availability flag
only when he/she is available for interpretation, thereby
eliminating useless calling and enabling the user to
select an available interpreter without delay.

Fig. 4 shows a processing flowchart of the connection
processing by the controller 110. The videophone
interpretation system 100 accepts an order for an interpretation
service when the caller calls a telephone number of the caller
terminal line I/F. The videophone interpretation system 100
then calls the interpreter terminal and the callee terminal,
and establishes a connection for the videophone interpretation
service.

As shown in Fig. 4, the presence of a call to the caller
terminal line I/F 120 is detected initially (S100). When a call
is detected, a screen to prompt input of the language type of
the caller is displayed on the caller terminal (S102). This
is accomplished, for example, by setting a message shown in
Fig. 5(a) to the caller terminal telop memory 132. The language
type of the caller input by the caller is acquired (S104).
Afterwards, messaging to the caller terminal and the interpreter
terminal is provided using the language type of the caller
acquired. Next, a screen which prompts input of a
language type of the callee is displayed on the caller terminal
(S106). This is accomplished, for example, by setting a message
shown in Fig. 5(b) to the caller terminal telop memory 132. The language
type of the callee input by the caller is acquired (S108).
Afterwards, messaging to the callee terminal is provided using the
language type of the callee acquired.

A screen which prompts input of interpreter
selection conditions is displayed on the caller terminal (S110).
This is accomplished, for example, by setting a message shown
in Fig. 6(a) to the caller terminal telop memory 132. The
interpreter selection conditions input by the caller are
acquired (S112). The interpreter selection conditions input
by the caller are a gender, an age bracket, an area, a specialty
and an interpretation level. The area is specified by using
a ZIP code and an interpreter is selected beginning
with the habitation closest to the specified area.
If it is not necessary to specify a condition
for any selections, "N/A" may be selected.

Next, an interpreter who has a specified interpretation
level of the language of the caller and the language of the
callee, and whose gender, age, habitation and specialty satisfy
the acquired selection conditions, with his/her availability
flag being set, is extracted with reference to the
interpreter registration table 112, and the caller terminal
displays a list of interpreter candidates and prompts
input of the selection number of a desired interpreter (S114).
This is accomplished, for example, by setting a message and
an interpreter list shown in Fig. 6(b) to the caller terminal
telop memory 132. The hourly rates of
the interpreter (not shown) registered in the interpreter
registration table 112 are then extracted and displayed as
a fee. This enables the user to consider the cost of
the interpretation service before selecting an appropriate
interpreter. The hourly rates of the interpreter may be
determined from the interpretation level of the selected
interpreter by referencing an accounting table which specifies
the relationship between the interpretation level and the hourly
rates. The selection number input by the caller referring to
the interpreter candidate list is acquired (S116). The terminal
number of the selected interpreter is extracted from the
interpreter registration table 112 and called (S118). Personal
information about a caller, language types of the caller and
callee, and interpreter selection conditions may be
communicated to the interpreter terminal by using the
interpreter terminal telop memory 172 so as to accept the
interpretation. Personal information about the caller may
be available, for example, from pre-registered member information
if the interpretation service is a membership service.
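
The extraction of interpreter candidates described above can be
sketched as a filter over the interpreter registration table. The
record layout, the function name select_candidates and the sample
values are hypothetical. Two further assumptions are made for the
example: a requested level also admits interpreters of a higher level
(a lower number), and closeness of habitation is approximated by the
numeric difference between ZIP codes. A condition of None stands for
"N/A".

```python
from dataclasses import dataclass, field


@dataclass
class Interpreter:
    terminal_number: str
    gender: str
    age: int
    zip_code: str
    levels: dict                       # language -> 1 (Advanced), 2 (Middle), 3 (Basic)
    specialties: set = field(default_factory=set)
    available: bool = False            # availability flag


def select_candidates(table, caller_lang, callee_lang, level=None,
                      gender=None, age_range=None, area_zip=None,
                      specialty=None):
    """Return available interpreters that satisfy the given conditions."""
    def matches(i):
        if not i.available:
            return False
        for lang in (caller_lang, callee_lang):
            if lang not in i.levels:
                return False
            # Assumption: level 1 is the highest, so a requested level
            # of 2 also admits level 1 interpreters.
            if level is not None and i.levels[lang] > level:
                return False
        if gender is not None and i.gender != gender:
            return False
        if age_range is not None and not (age_range[0] <= i.age <= age_range[1]):
            return False
        if specialty is not None and specialty not in i.specialties:
            return False
        return True

    candidates = [i for i in table if matches(i)]
    if area_zip is not None:
        # Assumption: numeric ZIP difference as a crude stand-in for distance.
        candidates.sort(key=lambda i: abs(int(i.zip_code) - int(area_zip)))
    return candidates
```

The resulting list corresponds to the candidate list presented to the
caller; the caller, not the system, makes the final selection.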

When a response is received from the interpreter terminal
(S120), a screen which prompts input of the terminal
number of the callee is displayed on the caller terminal (S122).
This is accomplished, for example, by setting a message shown
in Fig. 7 to the caller terminal telop memory 132. The terminal
number of the callee input by the caller is extracted and the
callee is called (S124). Similar to the
procedure described above, personal information about a
caller, language types of the caller and callee, and interpreter
selection conditions may be communicated to the callee terminal
by using the callee terminal telop memory 152 so as to confirm
whether to accept the call and to determine whether an error
in the set conditions has occurred.

When a response is received from the callee terminal (S126),
a videophone interpretation service begins (S128).

If a response is not received from the interpreter
terminal in S120, whether another candidate is available is
determined (S130). If another candidate is available,
execution returns to S118 and the procedure is repeated. If
another candidate is unavailable, the caller terminal
is notified of such and the call is released (S132). If
a response is not received from the callee terminal in
S126, the caller terminal and the selected interpreter terminal
are notified of such and the call is released (S134).
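
The branching in steps S118 to S134 can be summarized in a short
sketch. The function name connect, the callback parameters and the
returned tuples are hypothetical; actual calling, messaging and
release of the lines are outside the scope of the example.

```python
def connect(candidates, interpreter_answers, callee_number, callee_answers):
    """Sketch of the connection flow for steps S118 to S134.

    candidates          -- interpreter terminal numbers, in the order tried
    interpreter_answers -- callable(number) -> bool, response check (S120)
    callee_answers      -- callable(number) -> bool, response check (S126)
    """
    for interpreter in candidates:             # S118, repeated via S130
        if interpreter_answers(interpreter):   # S120
            break
    else:
        # S132: no candidate responded; notify the caller and release.
        return ("released", None, "no interpreter available")

    if not callee_answers(callee_number):      # S124, S126
        # S134: notify the caller and the interpreter, then release.
        return ("released", interpreter, "callee did not respond")

    return ("connected", interpreter, None)    # S128
```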

The controller 110 includes a timer (not shown) for
calculating the fee of the interpretation service. The timer
measures the time from when the connection is established to
when it is released. On completion of an interpretation service,
the fee is calculated based on the time measured by the timer
and the hourly rates mentioned above, registered in an
accounting database 114, and charged to the user at a later
time.
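
The fee calculation described above amounts to multiplying the
measured connection time by the hourly rate. In the following sketch
the rates in the accounting table, the function name
interpretation_fee and the rounding to two decimal places are
assumptions; the specification does not state a rounding or minimum
charge rule.

```python
# Hypothetical accounting table: interpretation level -> hourly rate.
ACCOUNTING_TABLE = {1: 60.0, 2: 40.0, 3: 25.0}


def interpretation_fee(connected_at, released_at, level,
                       accounting_table=ACCOUNTING_TABLE):
    """Fee for one service, from timer readings given in seconds."""
    if released_at < connected_at:
        raise ValueError("release time precedes connection time")
    hours = (released_at - connected_at) / 3600.0
    # Assumption: proportional billing, rounded to two decimal places.
    return round(accounting_table[level] * hours, 2)
```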

When the selected interpreter terminal
does not accept the call, the caller is simply notified of
such and the call is released in the preferred embodiment
described above. However, an interpretation reservation table to
register a caller terminal number and a callee terminal number
may be provided and the caller and the callee may be notified
by a later response from the selected interpreter to set
a videophone conversation.

While the caller is prompted to input the language types
of the caller and the callee for selection of an interpreter
in this preferred embodiment, a telephone number of an
interpretation center may be specified per language type of
the caller or per combination of the language type of the caller
and the language type of the callee in order to acquire the
language type of the caller or the callee. While the caller
is prompted to input the interpreter selection conditions for
selecting an interpreter in this preferred embodiment, the
caller may be prompted whether to specify
the interpreter selection conditions, and if he/she
has decided not to specify the interpreter selection
conditions, only the input language types may be used to select
an interpreter.

A configuration is provided where, in an emergency, the caller
first dials a specific number to automatically call an interpreter
dedicated to an emergency situation.

While the videophone interpretation system 100
includes a line I/F, a multiplexer/demultiplexer, a
video CODEC, an audio CODEC, a video synthesizer, an audio
synthesizer and a controller in this preferred
embodiment, these components need not be provided
by individual hardware (H/W), and instead the function
of each component may be provided by software
processing running on a computer.

While the interpreter terminal 30, similar to
the caller terminal 10 and the callee terminal 20, is located
outside the interpretation center and called from the
interpretation center over a public telephone line to provide
an interpretation service in this preferred embodiment,
the present invention is not limited thereto, and some
or all of the interpreter terminals may be installed in the
interpretation center such that the interpretation services
are provided from the interpretation center.

In this preferred embodiment, an
interpreter can join an interpretation service anywhere he/she
may be, as long as he/she has a terminal which can be connected
to a public telephone line. Thus, the interpreter can provide
an interpretation service by using the availability flag to
make efficient use of free time. This enables efficient
and stable operation of interpretation services which
often have difficulty in securing necessary personnel.

While a video signal of the home terminal is not input
to the video synthesizers 128, 148, 168 in this
preferred embodiment, a function may be provided to input the
video signal of the home terminal, and synthesize and display
the video signal to check the video on the terminal.

While the video synthesizers 128, 148, 168 are used to
synthesize videos for each terminal in this preferred
embodiment, the present invention is not limited thereto,
and videos from all terminals may be synthesized at the
same time, and the resultant video may be transmitted to each terminal.
In this case, as shown in Fig. 21(a) for example, a video of
the caller, a video of the callee and a video of the interpreter
may be displayed in a four split screen.

While a function is provided whereby the telop memories
132, 152, 172 are provided and their outputs are added to the
corresponding video synthesizers 128, 148, 168, respectively,
in order to display a text telop on each terminal in this
preferred embodiment, a function may be provided whereby telop
memories to store audio information are provided and each output
is added to the audio synthesizers 130, 150, 170 in order to
output an audio message on each terminal. This makes it possible
to provide a videophone interpretation service even if
any of the caller, the callee or the interpreter is a visually
impaired person.

Fig. 8 is a system block diagram of a videophone
interpretation system according to a second preferred
embodiment of the invention. In this preferred embodiment,
the system configuration example is one in which the terminals used
by a caller, a callee and an interpreter are IP (Internet
Protocol) type videophone terminals that are equipped with a web
browser and connected to the Internet.

In Fig. 8, a numeral 200 represents a videophone
interpretation system installed in an interpretation center
to provide an interpretation service. The videophone
interpretation system 200 connects a caller terminal 60 used
by a caller, a callee terminal 70 used by a callee, and any
of the interpreter terminals 231, 232, ... used by an interpreter
via the Internet 80 in order to provide a videophone
interpretation service to the caller and the callee.

While the caller terminal 60, the callee terminal 70 and
the interpreter terminals 231, 232, ... each include a
general-purpose processing device (a) such as a personal
computer having a video input I/F function, an audio input/output
I/F function and a network connection function, the processing
device being equipped with a keyboard (b) and a mouse (c) for input
of information, a display (d) for displaying a web
page screen presented by a web server 210 and a videophone screen
supplied by a communications server 220, a television camera
(e) for capturing the video of each terminal user, and a headset
(f) for performing audio input/output for each terminal user,
and the processing device has IP videophone software and a web
browser installed in this example, a dedicated videophone
terminal may be used instead.

The videophone terminal connected to the Internet may
be an IP videophone terminal based on ITU-T Recommendation H.323.
However, the invention is not limited thereto, and may use
a videophone terminal which employs a unique protocol.

The Internet may be of a wireless LAN type. The videophone
terminal may be a cellular phone or a portable terminal equipped
with a videophone function and also including a web access
function.

The videophone interpretation system 200
includes a communications server 220 including a
connection table 222 for setting the terminal addresses of a
caller terminal, a callee terminal and an interpreter terminal,
and a function to interconnect the terminals
registered in the connection table 222 and synthesize video
and audio received from each terminal and transmit the
synthesized video and audio to each terminal, a web server
210 including an interpreter registration table 212 for
registering the interpreter selection information, terminal
address and availability flag of each interpreter as
described above, and a function to select
a desired interpreter based on an access from a caller terminal
by using a web browser and set the terminal address of each
of the caller terminal, the callee terminal and interpreter
terminal in the connection table 222 of the communications server
220, a router 250 for connecting the web server 210 and the
communications server 220 to the Internet, and a plurality
of interpreter terminals 231, 232, ..., 23N connected to the
communications server 220 via a network.

Fig. 9 shows an example of a connection table 222. As
shown in Fig. 9, the terminal address of a caller terminal,
the terminal address of a callee terminal and the terminal
address of an interpreter terminal are registered together as
a set in the connection table 222. This provides a single
interpretation service. The connection table 222 is designed
to register a plurality of such terminal address sets depending
on the throughput of the communications server 220, thereby
simultaneously providing a plurality of interpretation
services.
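
A connection table of this kind can be pictured as a bounded
collection of address sets, one set per interpretation service. The
class below is a minimal sketch under that reading; the class and
method names, the numeric set identifiers and the fixed capacity
standing in for server throughput are all hypothetical.

```python
class ConnectionTable:
    """Sketch of a connection table holding one address set per service."""

    ROLES = ("caller", "callee", "interpreter")

    def __init__(self, max_sets):
        self.max_sets = max_sets      # stand-in for the server throughput limit
        self.sets = {}
        self._next_id = 1

    def open_set(self, caller_addr):
        """Register a new set starting with the caller address."""
        if len(self.sets) >= self.max_sets:
            raise RuntimeError("connection table is full")
        set_id = self._next_id
        self._next_id += 1
        self.sets[set_id] = {"caller": caller_addr,
                             "callee": None,
                             "interpreter": None}
        return set_id

    def set_address(self, set_id, role, addr):
        if role not in self.ROLES:
            raise ValueError("unknown role: " + role)
        self.sets[set_id][role] = addr

    def is_complete(self, set_id):
        """True once all three addresses of the set are registered."""
        return all(self.sets[set_id][role] for role in self.ROLES)

    def release(self, set_id):
        del self.sets[set_id]
```

A service can start once is_complete returns True for its set, and
releasing a set frees capacity for another service.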

While the terminal address registered in the connection
table 222 is an address on the Internet and is generally an
IP address, the invention is not limited thereto, and, for
example, a name given by a directory server may be used.

The communications server 220 performs packet
communications using a predetermined protocol with the caller
terminal, the callee terminal and interpreter terminal set in
the connection table 222 and provides, by way of software
processing, the functions similar to those provided by a
multiplexer/demultiplexer 122, 142, 162, a video CODEC 124,
144, 164, an audio CODEC 126, 146, 166, a video synthesizer
128, 148, 168, and an audio synthesizer 130, 150, 170 in the
videophone interpretation system 100.

With this configuration, similar to the videophone
interpretation system 100, prescribed videos and audios are
communicated between a caller terminal, a callee terminal and
an interpreter terminal, and a videophone interpretation
service is provided between the caller and the callee.

While the videophone interpretation system 100 preferably
uses the controller 110 and the telop memories 132, 152, 172
to extract a term registered in the term registration table
113 during a videophone conversation by a command from a terminal
and displays the term as a telop on the terminal, the same function
may be provided by software processing by the
communications server 220 in this preferred embodiment.
A term specified by each terminal may be displayed as a popup
message on the other terminal by way of the web server 210.
Alternatively, a telop memory may be provided in the communications
server 220 and a term specified by each terminal may be written into
the telop memory via the web server 210 to display a text telop
on each terminal.

While the aforementioned interpretation center uses the
controller 110 to interconnect a caller terminal, a callee
terminal and an interpreter terminal, the connection procedure
is performed by the web server 210 in this preferred embodiment because
each terminal has a web access function.

Fig. 10 is a processing flowchart of a connection procedure
by the web server 210. In the videophone interpretation system
200, a caller terminal may access and log into the web server
210 in the interpretation center, which begins the
acceptance of the interpretation service.

As shown in Fig. 10, the web server 210 first acquires
the terminal address of a caller (S200) and sets the terminal
address to the connection table 222 (S202). Next, the web server
delivers a screen which prompts input of the language
type of the caller, similar to that shown in Fig. 5(a), (S204)
to the caller terminal. The language type of the caller input
by the caller is acquired (S206). The web server delivers a
screen to prompt input of the language type of the callee, similar
to that shown in Fig. 5(b), (S208) to the caller terminal.
The language type of the callee input by the caller is acquired
(S210). The web server delivers a screen to prompt input of
the selection conditions, similar to that shown in Fig. 6(a),
to the caller terminal (S212). The interpreter selection
conditions input by the caller are acquired (S214).

Next, an interpreter with an availability flag set is
selected from among the interpreters satisfying the language
type and selection conditions with reference to the interpreter
registration table 212. The web server 210 delivers a list
of interpreter candidates, similar to that shown in Fig. 6(b),
to the caller terminal to prompt input of the selection number
of a desired interpreter (S216). The selection number of the
interpreter input by the caller is acquired and the terminal
address of the selected interpreter is acquired from the
interpreter registration table 212 (S218). Based on the
acquired terminal address of the interpreter, the web server
210 delivers a calling screen to the interpreter terminal (S220).
If the call is accepted by the interpreter (S222), the
terminal address of the interpreter is set in the connection
table 222 (S224). The web server 210 delivers a screen to prompt
input of the terminal address of the callee, similar to that
shown in Fig. 7, to the caller terminal (S226). The terminal
address of the callee input by the caller is acquired (S228).
Based on the acquired terminal address of the callee, the web
server 210 delivers a calling screen to the callee terminal
(S230). If the call is accepted by the callee terminal
(S232), the callee terminal address is set in the connection
table 222 (S234). Then, a videophone interpretation service
begins (S236).

If the interpreter terminal does not accept the
call in S222, whether another candidate is available is
determined (S238). If another candidate is available,
the web server delivers a message to prompt the caller to select
another candidate to the caller terminal (S240), then execution
returns to S218. If another candidate is not found,
the web server notifies the caller terminal of such (S242)
and the call is released. If the callee terminal does
not accept the call in S232, the caller terminal and the selected
interpreter terminal are notified of such (S244) and the
call is released.

When the selected interpreter terminal
does not accept the call, the caller is notified of such
and the call is released in this preferred embodiment.
However, an interpretation reservation table to register a
caller terminal address and a callee terminal address may be
provided and the caller and the callee may be notified by
a later response from the selected interpreter to set a
videophone interpretation service.

While the interpreter terminal is located in the
videophone interpretation system 200 of the interpretation
center in this preferred embodiment, the present
invention is not limited thereto, and some or all of the
interpreter terminals may be installed outside the interpretation
center and connected via the Internet. These terminals may be
addressed by the same processing.

In the preferred embodiments described above, the
configuration of the videophone interpretation system has been
described for a case in which a videophone terminal used
by a caller, a callee or an interpreter is a telephone-type
videophone terminal connected to a public telephone line, and
a case in which the videophone terminal is an IP-type
videophone terminal connected to the Internet. The
telephone-type videophone terminal and the IP-type videophone
terminal can communicate with each other by providing
a gateway to perform protocol conversion therebetween. A
videophone interpretation system conforming to one protocol
may be provided to support a videophone terminal which uses
another protocol.

In this manner, the videophone interpretation system
enables the user to receive or provide an
interpretation service anywhere he/she may be, as long as he/she
has a terminal which can be connected to a public telephone
line or the Internet. An interpreter does not always have to
visit an interpretation center, but can join a conversation
via interpretation from his/her home or a facility or site where
a videophone terminal is located, or provide an interpretation
service by using a cellular phone or a portable terminal equipped
with a videophone function.

A person with interpretation skills may
wish to register in the interpreter registration table in the
interpretation center in order to provide an interpretation
service anytime it is convenient for him/her. From the
viewpoint of the operation of the interpretation center, it
is not necessary for the interpreters to be at the
center. This enables efficient operation of the
interpretation center both in terms of time and costs.

While one interpreter performs both interpretation from
the language of the callee into the language of the caller and
interpretation from the language of the caller into the language
of the callee in this preferred embodiment, a first
interpreter to interpret the language of the callee into the
language of the caller and a second interpreter to interpret
the language of the caller into the language of the callee may
be individually provided to perform a bidirectional
simultaneous interpretation.

Fig. 11 shows an example of the system configuration of
a videophone interpretation system which provides a
bidirectional simultaneous interpretation according to a third
preferred embodiment of the present invention. While this
example uses a telephone-type videophone, an IP-type videophone
may be used as mentioned above.

In Fig. 11, a numeral 300 represents a videophone
interpretation system installed in an interpretation center
which provides a bidirectional simultaneous interpretation
service. The videophone interpretation system 300
interconnects a videophone terminal used by a caller
(hereinafter referred to as a caller terminal) 10, a videophone
terminal used by a callee (hereinafter referred to as a callee
terminal) 20, a videophone terminal used by a first interpreter
(hereinafter referred to as a first interpreter terminal) 32,
and a videophone terminal used by a second interpreter
(hereinafter referred to as a second interpreter terminal) 34
via a public telephone line 40 in order to provide a videophone
interpretation service in which a videophone conversation
between a caller and a callee is interpreted by the first
interpreter and the second interpreter.

The videophone interpretation system 300
includes a caller terminal line I/F 320, a callee
terminal line I/F 340, a first interpretation terminal line
I/F 360 and a second interpretation terminal line I/F 380. To
each I/F, a multiplexer/demultiplexer 322, 342, 362, 382 for
multiplexing/demultiplexing a video signal, an audio signal
or a data signal, a video CODEC (coder/decoder) 324, 344, 364,
384 for compressing/expanding a video signal, and an audio CODEC
326, 346, 366, 386 for compressing/expanding an audio signal
are connected. Each line I/F, each multiplexer/demultiplexer,
and each video CODEC or each audio CODEC performs call control,
streaming control and compression/expansion of a video/audio
signal in accordance with a protocol used by each terminal.

To the video input of the caller terminal video CODEC 
324, a video synthesizer 328 for synthesizing the video output 
of the callee terminal video CODEC 344, the video output of 
the first interpreter terminal video CODEC 364 and the output
of the caller terminal telop memory 332 is connected. 

To the video input of the callee terminal video CODEC 
344, a video synthesizer 348 for synthesizing the video output 
from the caller terminal video CODEC 324, the video output from 
the second interpreter terminal video CODEC 384, and the output
of the callee terminal telop memory 352 is connected. 

To the video input of the first interpreter terminal video 
CODEC 364, a video synthesizer 368 for synthesizing the video 
output of the caller terminal video CODEC 324, the video output
of the callee terminal video CODEC 344, and the output of the 
first interpreter terminal telop memory 372 is connected. 

To the video input of the second interpreter terminal 
video CODEC 384, a video synthesizer 388 for synthesizing the 
video output of the callee terminal video CODEC 344, the video 
output of the caller terminal video CODEC 324, and the output 
of the second interpreter terminal telop memory 392 is connected.
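
The video connections described above can be summarised as a
routing table. The following is a minimal sketch only; the names
VIDEO_ROUTES and sources_for are hypothetical and do not appear in
the specification. It records which decoded video streams and
which telop memory feed each video synthesizer.

```python
# Hypothetical routing table:
# synthesizer numeral -> (destination terminal, video sources, telop memory numeral).
# The numerals follow the description; the identifiers are illustrative only.
VIDEO_ROUTES = {
    328: ("caller", ["callee", "first_interpreter"], 332),
    348: ("callee", ["caller", "second_interpreter"], 352),
    368: ("first_interpreter", ["caller", "callee"], 372),
    388: ("second_interpreter", ["callee", "caller"], 392),
}


def sources_for(destination):
    """Return the video sources synthesized for the given destination terminal."""
    for dest, sources, _telop in VIDEO_ROUTES.values():
        if dest == destination:
            return list(sources)
    raise KeyError(destination)
```

In this table no terminal receives its own video, which matches
the configuration described for this preferred embodiment.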

While video display of a first interpreter or a second 
interpreter may be omitted on a caller terminal or a callee 
terminal, understanding of the voice interpreted by the 
interpreter is facilitated by displaying the video
of the interpreter, such that it is preferable to be able
to synthesize the video of an interpreter. 

While video display of a caller or a callee may be omitted 
on a first interpreter terminal or a second interpreter terminal,
understanding of the voice interpreted by the interpreter is
facilitated by displaying the videos, such that it
is preferable to be able to display the video of a caller or 
a callee. 

Fig. 12 (a)-(d) show an example of video displayed
on the screen of each terminal during a videophone conversation
via the videophone interpretation system 300. Fig.
12(a) shows the screen of a caller terminal, on which a
synthesized video of a callee and a first interpreter obtained
by the video synthesizer 328 is displayed. While the video
of the callee is displayed as a main window and the video of
the first interpreter is displayed as a sub window in a
Picture-in-Picture fashion in this example, the
Picture-in-Picture may also display
the video of the first interpreter as a main window and the
video of the callee as a sub window. Or, these videos may be
displayed in equal size. Fig. 12(b) shows the screen of a callee
terminal, on which a synthesized video of a caller and a second
interpreter obtained by the video synthesizer 348 is displayed.
While the video of the caller is displayed as a main window
and the video of the second interpreter is displayed as a sub
window in a Picture-in-Picture fashion in this example, the
Picture-in-Picture may also display
the video of the second interpreter as a main window and the
video of the caller as a sub window. Or, these videos may be
displayed in equal size. Fig. 12(c) shows the screen of a first
interpreter terminal, on which a synthesized video of a callee
and a caller obtained by the video synthesizer 368 is displayed.
While the video of the callee is displayed as a main window
and the video of the caller is displayed as a sub window in
a Picture-in-Picture fashion in this example, the videos may
appear in opposite windows. Or, these videos may be displayed
in equal size. Fig. 12(d) shows the screen of a second
interpreter terminal, on which a synthesized video of a caller
and a callee obtained by the video synthesizer 388 is displayed.
While the video of the caller is displayed as a main window
and the video of the callee is displayed as a sub window in
a Picture-in-Picture fashion in this example, the videos may
appear in opposite windows. Or, these videos may be displayed
in equal size.

To the audio input of the caller terminal audio CODEC 
326, an audio synthesizer 330 for synthesizing the audio output 
of the callee terminal audio CODEC 346 and the audio output 
of the first interpreter terminal audio CODEC 366 is connected. 
To the audio input of the callee terminal audio CODEC 346,
an audio synthesizer 350 for synthesizing the audio output of
the caller terminal audio CODEC 326 and the audio output of
the second interpreter terminal audio CODEC 386 is connected.

To the audio input of the first interpreter terminal audio 
CODEC 366, the audio output of the callee terminal audio CODEC 
346 is connected. To the audio input of the second interpreter 
terminal audio CODEC 386, the audio output of the caller terminal
audio CODEC 326 is connected. 

With this configuration, the audio of the first 
interpreter is transmitted only to the caller, and the audio 
of the second interpreter is transmitted only to the callee. 
Thus, the speech of the caller is not disturbed by the audio 
of the second interpreter, and the speech of the callee is not
disturbed by the audio of the first interpreter, thereby
providing an effective conversation.

The caller terminal audio synthesizer 330 is equipped
with a function to suppress the audio level from the callee
terminal when the audio from the first interpreter terminal
is detected, and the callee terminal audio synthesizer 350 is
equipped with a function to suppress the audio level from the
caller terminal when the audio from the second interpreter
terminal is detected. This prevents overlapping of the audio
of the first interpreter or the second interpreter over the
audio of the opponent party which hinders
listening. The first interpreter and the second interpreter
can simultaneously interpret the speech of the speaker, thus
enabling a quick and precise interpretation.

Fig. 17 shows specific examples of the function to suppress 
the audio of the callee or caller in the audio synthesizers 
330, 350. As shown in Fig. 17, the audio output of the first 
interpreter terminal audio CODEC 366 is connected to a caller
terminal audio signal adder 390. The audio output of the second
interpreter terminal audio CODEC 386 is connected to a callee
terminal audio signal adder 393. As a result, the unnecessary
voice of the second interpreter is not transmitted to the caller 
and the unnecessary voice of the first interpreter is not 
transmitted to the callee. 

To the caller terminal audio signal adder 390, the audio
output of the callee terminal audio CODEC 346 is connected via
an attenuator 391, which attenuates the audio from the callee
terminal when the audio of the first interpreter is detected
by the signal detector 392. To the callee terminal audio signal
adder 393, the audio output of the caller terminal audio CODEC
326 is connected via an attenuator 394, which attenuates the
audio from the caller terminal when the audio of the second
interpreter is detected by the signal detector 395. The signal
detectors 392, 395 are set to an appropriate detection level
in order to prevent the audio of the opponent party from being
attenuated by mistake due to noise.
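
The combination of signal detector, attenuator and audio signal
adder in Fig. 17 can be illustrated as follows. This is a minimal
sketch under stated assumptions: the function name
mix_with_ducking, the threshold and the attenuation factor are
hypothetical, and detection is done per sample for brevity,
whereas a practical detector would more likely track a short-term
signal level.

```python
def mix_with_ducking(party, interpreter, threshold=0.05, attenuation=0.25):
    """Sketch of adder 390 with attenuator 391 and signal detector 392.

    party        -- samples from the opponent party, floats in [-1, 1]
    interpreter  -- samples from the interpreter, same length as party
    threshold    -- assumed detection level of the signal detector
    attenuation  -- assumed gain applied to the party while the interpreter is detected
    """
    if len(party) != len(interpreter):
        raise ValueError("inputs must have the same length")
    mixed = []
    for p, i in zip(party, interpreter):
        detected = abs(i) > threshold            # signal detector
        gain = attenuation if detected else 1.0  # attenuator
        mixed.append(p * gain + i)               # audio signal adder
    return mixed
```

With these assumed values, interpreter input below the threshold
(for example, line noise) leaves the opponent party at full
level, which is the purpose of setting an appropriate detection
level.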

In order to ensure that the caller or the callee can hear 
the audio of an interpreter immediately after the audio of the 
interpreter is detected by the signal detector 392, 395, an 
appropriate signal delay unit may be provided at the interpreter 
audio input of the audio signal adder 390, 393. 

While the audio of the opponent party is attenuated by 
the attenuator 391, 394 such that the caller or callee can
hear the original voice of the opponent party to some extent
in the background of the audio of the first interpreter or second
interpreter in this preferred embodiment, a switch may be used
instead to turn off the audio of the opponent party.

Fig. 18 shows an example in which the audio of the
opponent party is turned off when the audio of the
interpreter is transmitted, and only the audio of the interpreter
is transmitted. As shown in Fig. 18, switches 397, 398 are
used instead of the audio signal adders 390, 393. When the
audio of the interpreter is detected by the signal detectors
392, 395, the switches 397, 398 are turned from the audio of
the opponent party to the audio of the interpreter. The
remaining configuration is the same as that shown in Fig. 17.

In order to ensure that the caller or the callee can hear 
the audio of an interpreter immediately after the audio of the 
interpreter is detected by the signal detector 392, 395, an 
appropriate signal delay unit may be provided at the interpreter
audio input of the switch 397, 398. 
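
The switch of Fig. 18 combined with a signal delay unit can be
sketched as follows. The function name switch_with_delay, the
threshold and the delay length are hypothetical. The sketch
assumes the detector observes the undelayed interpreter signal,
so that the switch has already turned over when the delayed
interpreter audio reaches it.

```python
from collections import deque


def switch_with_delay(party, interpreter, threshold=0.05, delay=2):
    """Output the delayed interpreter audio while it is detected, else the party."""
    line = deque([0.0] * delay)  # signal delay unit at the interpreter input
    out = []
    for p, i in zip(party, interpreter):
        line.append(i)
        # The detector fires while any sample still in the delay line is above threshold.
        active = any(abs(x) > threshold for x in line)
        delayed = line.popleft()
        out.append(delayed if active else p)  # switch 397 / 398
    return out
```

Because the switch turns over as soon as the interpreter starts
speaking, the beginning of the interpreter's audio is not cut
off; the cost is that the opponent party is muted slightly
earlier than the interpreter becomes audible.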

While the audio signal adder 390, 393 simply adds the 
audio of the interpreter and the audio of the opponent party 
in this preferred embodiment, audio
multiplexing of two signals may be used as well. For
example, if a terminal supports stereophonic audio,
stereophonic synthesis is performed on the audio of the
opponent party as the left channel and the audio of the
interpreter as the right channel and the result is transmitted
to a terminal, where the receiving party selects a necessary 
audio. In this configuration, it is not necessary to provide 
an attenuator to attenuate the audio of the distant party in 
the videophone interpretation system. The receiving party 
listens to the audios while adjusting the volume balance of 
the right and left channels of a headset. 
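
The stereophonic alternative can be sketched as two small
functions: one for the system side, one for the receiving
terminal. The names stereo_multiplex and listener_mix and the
balance convention are hypothetical, chosen only to illustrate
the idea.

```python
def stereo_multiplex(party, interpreter):
    """System side: pair samples as (left, right) = (opponent party, interpreter)."""
    if len(party) != len(interpreter):
        raise ValueError("inputs must have the same length")
    return list(zip(party, interpreter))


def listener_mix(frames, balance):
    """Terminal side: balance 0.0 plays only the left channel, 1.0 only the right."""
    if not 0.0 <= balance <= 1.0:
        raise ValueError("balance must be between 0.0 and 1.0")
    return [(1.0 - balance) * left + balance * right for left, right in frames]
```

Here the attenuation decision moves from the interpretation
system to the listener, which is why no attenuator is needed in
the system in this configuration.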

While the first interpreter listens only to the audio 
of the callee to perform interpretation and the second
interpreter listens only to the audio of the caller to perform
interpretation, a configuration may be provided
in which the audio of the caller and the audio of the second
interpreter may be attenuated and added to or audio multiplexed
into the audio to be transmitted to the first interpreter, and
also the audio of the callee and the audio of the first interpreter
may be attenuated and added to or audio multiplexed into the
audio to be transmitted to the second interpreter. By doing
so, each interpreter can perform interpretation while checking
the progress of the entire conversation and the
responses of the interpretee.

The videophone interpretation system 300
includes an interpreter registration table 312 in which
the terminal number of a terminal used by an interpreter is
registered and includes a controller 310 connected to each of
the line I/Fs 320, 340, 360, 380, multiplexers/demultiplexers
322, 342, 362, 382, video synthesizers 328, 348, 368, 388, audio
synthesizers 330, 350, and telop memories 332, 352, 372, 392.
The controller 310 provides a function to connect a caller
terminal, a callee terminal, a first interpreter terminal, and
a second interpreter terminal by way of a function to accept
a call from a caller terminal, a function to acquire the language
type of the caller and the language type of the callee, a function
to acquire the selection conditions for selecting an interpreter,
a function to extract the terminal number of the first
interpreter and the terminal number of the second interpreter
by referencing an interpreter registration table 312 by using
the acquired language types and selection conditions, a function
to call the first interpreter terminal and second interpreter
terminal by using the terminal numbers of the interpreters
extracted, and a function to call the callee terminal by using
the acquired terminal number of the callee.

Operation of the video synthesizers 328, 348, 368, 388 
and audio synthesizers 330, 350 is controlled by the controller 
310. A function is included in which the user changes
the video output method or audio output method by pressing a
predetermined number button of a dial pad of each terminal.
This is provided such that the
multiplexer/demultiplexer 322, 342, 362, 382 detects that the number
button on the dial pad of each terminal is pressed based on
a data signal or a tone signal and signals the detection to
the controller. This ensures flexibility in the usage of the
system on each terminal. For example, only necessary videos
or audios are selected and displayed/output in accordance with
the objective, or it is possible to replace a main window
with a sub window, or change the position of the sub window.

To the input of the video synthesizers 328, 348, 368,
388, a caller terminal telop memory 332, a callee terminal telop
memory 352, a first interpreter terminal telop memory 372 and
a second interpreter terminal telop memory 392 are connected
respectively. Contents of each telop memory 332, 352, 372,
392 can be set by the controller 310. With this
configuration, by setting a message to be displayed on each
terminal to the telop memory 332, 352, 372, 392 and issuing
a command to select a signal of the telop memory 332, 352, 372,
392 to the video synthesizer 328, 348, 368, 388 in the setup
of a videophone conversation via interpretation, it is possible
to transmit necessary messages to respective terminals to
establish a four-way call.

If there is a term which is difficult to explain
or a word which is difficult to pronounce in a videophone
conversation, it is possible to register in advance the term
in the term registration table 313 of the controller 310 in
association with the number of the dial pad on each terminal.
By doing so, it is possible to detect that the dial pad on each
terminal is pressed during a videophone conversation by using
a data signal or a tone signal on the multiplexer/demultiplexer
322, 342, 362, 382, extract a term corresponding to the number
of the dial pad pressed from the term registration table 313,
generate a text telop, and set the text telop to each telop
memory, thereby displaying the term on each terminal. This
communicates, by way of a text telop, to the opponent party
a term which is difficult to explain or a word which is difficult
to pronounce, thus providing a quicker and more precise
videophone conversation.
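
The term lookup described above can be sketched as follows. The
class TelopMemory, the function show_registered_term and the
dictionary layout are hypothetical; the specification states
only that a term is registered in association with a dial pad
number and then set to each telop memory.

```python
class TelopMemory:
    """Stand-in for a telop memory holding the text to be overlaid on one terminal."""

    def __init__(self):
        self.text = ""


def show_registered_term(digit, term_table, telop_memories):
    """Set the term registered for a dial pad digit to every telop memory.

    Returns True if a term was registered for the digit, False otherwise.
    """
    term = term_table.get(digit)
    if term is None:
        return False  # nothing registered for this button; leave the telops unchanged
    for memory in telop_memories.values():
        memory.text = term
    return True
```

A caller could, for instance, register a hard-to-pronounce name
under one button before the call and display it on all four
terminals with a single key press.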

Next, the connection processing by the controller 310 
for establishing a videophone conversation via bidirectional 
simultaneous interpretation is described. 

Prior to processing,
interpreter selection information and a terminal number of a
terminal used by each interpreter are registered in the
interpreter registration table 312 of the controller 310 from
an appropriate terminal (not shown). Fig. 13 shows an example
of registration items to be registered in the interpreter
registration table 312. As shown in Fig. 13, items registered
in the interpreter registration table 312 are the same as those
registered in the interpreter registration table 112 shown in
Fig. 3, except that a listening comprehension level and a
speaking level are separately registered for a supported
language. By doing so, it is possible to individually select
an optimum interpreter as a first interpreter who interprets
the language of the callee into the language of the caller or
a second interpreter who interprets the language of the caller
into the language of the callee.

Fig. 14 shows a processing flowchart of the connection 
processing by the controller 310. The videophone
interpretation system 300 accepts an order for interpretation
services when the caller calls a telephone number of the
caller terminal line I/F. The videophone interpretation system
300 then calls the first interpreter terminal, the second
interpreter terminal, and the callee terminal, and establishes a
connection for a bidirectional simultaneous interpretation
service.

As shown in Fig. 14, the presence of the call to the caller 
terminal line I/F 320 is detected initially (S300). When a
call is detected, a screen which prompts input of the
language type of the caller, similar to that shown in Fig. 5(a),
is displayed on the caller terminal (S302). The language type
of the caller input by the caller is acquired (S304). A screen
which prompts input of the language type of the callee
similar to that shown in Fig. 5(b) is displayed on the caller
terminal (S306). The language type of the callee input by the
caller is acquired (S308). Next, a screen which
prompts the interpreter selection conditions similar to that
shown in Fig. 6(a) is displayed on the caller terminal (S310).
The interpreter selection conditions input by the caller are
acquired (S312). In this example, the interpreter selection
conditions are, similar to the previous single
interpretation, a gender, an age bracket, an area, a specialty
and an interpretation level. The area is specified by using
a ZIP code and an interpreter is selected beginning
with the habitation closest to the specified area. For any
selections, if it is not necessary to specify a condition,
"N/A" may be selected.

Next, an interpreter who has a specified listening
comprehension level of the language of the callee and a speaking
level of the language of the caller, and whose gender, age,
habitation and specialty satisfy the acquired selection
conditions, with his/her availability flag being set, is
selected as a first interpreter referring to the interpreter
registration table 312 (S314). The terminal number of the
selected interpreter is extracted and called (S316). When a
response is received from the first interpreter terminal (S318),
an interpreter who has a specified listening comprehension level
of the language of the caller and a speaking level of the language
of the callee, and whose gender, age, habitation and specialty
satisfy the acquired selection conditions, with his/her
availability flag being set, is selected as a second interpreter
referring to the interpreter registration table 312 (S320).
Then the terminal number of the selected interpreter is extracted
and called (S322).
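
The two selection steps (S314, S320) differ only in which
language is listened to and which is spoken. The following is a
minimal sketch under stated assumptions: the Interpreter record,
the function select_interpreter and the numeric level scale
(higher is better) are hypothetical, and the gender, age,
habitation and specialty conditions are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Interpreter:
    terminal_number: str
    listening: dict   # language -> listening comprehension level (assumed scale)
    speaking: dict    # language -> speaking level (assumed scale)
    available: bool = True  # availability flag


def select_interpreter(table, listen_lang, speak_lang, min_level, exclude=()):
    """Return the first available entry meeting both levels, or None."""
    for entry in table:
        if entry.terminal_number in exclude or not entry.available:
            continue
        if (entry.listening.get(listen_lang, 0) >= min_level
                and entry.speaking.get(speak_lang, 0) >= min_level):
            return entry
    return None
```

For the first interpreter, listen_lang would be the language of
the callee and speak_lang the language of the caller; for the
second interpreter the two are swapped.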

When a response is received from the second interpreter 
terminal (S324), a screen to prompt input of the terminal number
of the callee similar to that shown in Fig. 7 is displayed on
the caller terminal (S326). The terminal number of the callee
input by the caller is extracted and called (S328).

When a response is received from the callee terminal (S330),
a videophone interpretation service via bidirectional
simultaneous interpretation begins (S332).

If a response is not received from the first
interpreter terminal in S318, whether another candidate is
available is determined (S334). If another candidate
is available, execution returns to S314 and the procedure is
repeated. If another candidate is unavailable, the
caller terminal is notified of such and the call is released
(S336). If a response is not received from the second
interpreter terminal in S324, whether another candidate is
available is determined (S338). If another candidate
is available, execution returns to S320 and the procedure is
repeated. If another candidate is unavailable, the
caller terminal and the first interpreter terminal are notified
of such and the call is released (S340). If a response
is not received from the callee terminal in S330, the caller
terminal, first interpreter terminal and second interpreter
terminal are notified of such and the call is released (S342).
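
The calling sequence and its failure branches can be condensed
into a short sketch. The function connect and the call callback
are hypothetical stand-ins for the controller's call control;
the candidate lists are assumed to have been produced already by
the selection steps, and the prompts and notifications to the
terminals are omitted.

```python
def connect(first_candidates, second_candidates, callee_number, call):
    """Try candidates in order; call(number) returns True if the terminal answers."""
    first = next((n for n in first_candidates if call(n)), None)     # S314-S318, S334
    if first is None:
        return ("released", "no first interpreter")                  # S336
    second = next((n for n in second_candidates if call(n)), None)   # S320-S324, S338
    if second is None:
        return ("released", "no second interpreter")                 # S340
    if not call(callee_number):                                      # S328-S330
        return ("released", "callee did not answer")                 # S342
    return ("connected", (first, second, callee_number))             # S332
```

Each branch that releases the call corresponds to one of the
notification steps in the flowchart; later terminals are called
only after the earlier ones have answered.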

While, in a step of selecting a first interpreter (S314) 
and a step of selecting a second interpreter (S320), an
interpreter who satisfies predetermined conditions is selected
referring to the interpreter registration table 312 for
simplicity in this preferred embodiment,
a configuration is also possible in which, similar
to the first preferred embodiment, a candidate list similar
to that shown in Fig. 6(b) is displayed and the caller selects
an interpreter from the list. In this configuration, the
hourly rates (not shown) of each of the first interpreter and
second interpreter registered in the interpreter registration
table 312 may be extracted and displayed as a charge. This
enables the user to consider the cost of the interpretation
service before selecting an appropriate interpreter. The
hourly rates of the interpreter may be determined from the
interpretation level of the selected interpreter by referencing
an accounting table which specifies the relationship between
the interpretation level and the hourly rates.

The controller 310 includes a timer (not shown)
for calculating the fee of the interpretation service. The
timer measures the time from when the connection is established
to when it is released. Upon completion of an interpretation
service, the fee is calculated from the time measured by the
timer and the sum of the hourly rates of the first interpreter
and the second interpreter mentioned above and registered in
an accounting database 314, and charged to the user at a later
time.
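
The fee calculation amounts to the connected time multiplied by
the sum of the two hourly rates. The sketch below assumes
pro-rata billing by the second with no rounding or minimum
charge; the specification does not state a billing increment,
and the function name interpretation_fee is hypothetical.

```python
def interpretation_fee(connected_seconds, first_hourly_rate, second_hourly_rate):
    """Fee = connected time in hours x (rate of first + rate of second interpreter)."""
    if connected_seconds < 0:
        raise ValueError("connected time cannot be negative")
    hours = connected_seconds / 3600
    return hours * (first_hourly_rate + second_hourly_rate)
```

For example, a 30-minute service with hourly rates of 40 and 60
(in any currency unit) gives a fee of 50.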

When the selected interpreter terminal
does not accept the call, the caller is simply notified of
such and the call is released in this preferred
embodiment. However, an interpretation reservation table to
register a caller terminal number and a callee terminal number
may be provided such that, when both the first selected
interpreter and the second selected interpreter later respond
to accept the call, the caller and the callee are notified and
the videophone conversation service begins.

While the videophone interpretation system 300 
includes a line I/F, a multiplexer/demultiplexer, a
video CODEC, an audio CODEC, a video synthesizer, an audio
synthesizer and a controller in this preferred
embodiment, these components need not be provided
as individual hardware (H/W), and the function of each
component may be provided by software processing
on a computer.

While the first interpreter terminal 32 and the second 
interpreter terminal 34, similar to the caller terminal
10 and the callee terminal 20, are located outside the
interpretation center and called from the interpretation center
over a public telephone line to provide an interpretation service
in this preferred embodiment, the invention is not
limited thereto, and some or all of the interpreter terminals
may be installed in the interpretation center such that the
interpretation services are provided from the interpretation
center.

In this preferred embodiment, an
interpreter can join an interpretation service anywhere he/she
may be, as long as he/she has a terminal which can be connected
to a public telephone line. Thus, the interpreter can provide
interpretation services by using the availability
flag to make efficient use of free time. This enables
efficient and stable operation of interpretation services,
which often have difficulty in securing necessary
personnel.

While a video signal of the home terminal is not input 
to the video synthesizers 328, 348, 368, 388 in the
above-described preferred embodiment, a function may be
provided to input the video signal of the home terminal and
synthesize and display it so that the video can be checked on
the terminal.

While the video synthesizers 328, 348, 368, 388 are used 
to synthesize video for each terminal in the
above-described preferred embodiments, video
from all terminals may be synthesized at once and the result
may be transmitted to each terminal. In this case, as shown
in Fig. 21(b) for example, video of the caller, video of
the callee, video of the first interpreter and video of
the second interpreter may be displayed in a four split screen.
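
A four split screen can be laid out by dividing the frame into
equal quadrants. The sketch below is illustrative only: the
function name four_split_layout is hypothetical, and the
assignment of each party to a particular quadrant is an
assumption, since the arrangement in Fig. 21(b) is not described
in the text.

```python
def four_split_layout(width, height):
    """Return {party: (x, y, w, h)} for a 2 x 2 grid covering the frame."""
    half_w, half_h = width // 2, height // 2
    parties = ["caller", "callee", "first_interpreter", "second_interpreter"]
    return {
        name: ((k % 2) * half_w, (k // 2) * half_h, half_w, half_h)
        for k, name in enumerate(parties)
    }
```

Because a single synthesized picture is sent to every terminal,
only one synthesizer would be required in this variation instead
of one per terminal.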

While a function is provided whereby the telop memories 
332, 352, 372, 392 are provided and their outputs are added 
to the corresponding video synthesizers 328, 348, 368, 388 
respectively in order to display a text telop on each terminal 
in this preferred embodiment, a function may be provided
whereby telop memories to store audio information are provided
and their outputs are added to the audio synthesizers 330, 350
and an audio synthesizer is provided at the input of each of
the first interpreter terminal audio CODEC 366 and the second
interpreter terminal audio CODEC 386, and the outputs of the
corresponding telop memories are added in order to output an
audio message on each terminal. This makes it possible to
provide a videophone interpretation service even if any
of the caller, the callee, the first interpreter or the second 
interpreter is a visually impaired person. 

Finally, a recording/reproduction function to record
video or audio in a videophone interpretation service and
reproduce the audio or video and transmit the result upon
receiving a request from the user will be described.

Fig. 19 shows an example of a recording/reproduction 
function in the videophone interpretation system according to 
the first preferred embodiment. As shown in Fig. 19, video
from the caller terminal video CODEC 124, video from the callee
terminal video CODEC 144, and video from the interpreter
terminal video CODEC 164 are synthesized by the video synthesizer
116 and the result is transmitted to a video/audio
recorder/player 118. The audio output of the audio synthesizer
130 to be transmitted to the caller terminal and the audio output
of the audio synthesizer 150 to be transmitted to the callee
terminal are audio multiplexed by an audio multiplexer 117
in which the former is the left-channel and the latter
is the right-channel, and the result is transmitted to
the video/audio recorder/player 118.

The video output of the video synthesizer 116 and the 
audio output of the audio multiplexer 117 during an 
interpretation service are automatically recorded onto the 
video/audio recorder/player 118 and stored for each user based 
on a command from the controller 110. The video and audio stored
in the video/audio recorder/player 118 are reproduced based
on a command from the controller 110 when the
multiplexer/demultiplexer 122 or 142 detects that a predetermined
dial number is pressed on the caller terminal or callee terminal,
and the reproduced video and audio are transmitted to each
terminal via the video synthesizer 128 or 148 and the audio
synthesizer 130 or 150 for the detected terminal.

This allows the user to check video from each terminal
during an interpretation in a four split screen shown in Fig.
21(a). If the user terminal is equipped
with an audio multiplexing/demultiplexing function, audio
from each terminal can be checked, in the language of the
caller in left-channel and in the language of the callee in
right-channel. The user may call the interpretation center
at a later time and input a predetermined access code from his/her
terminal to reproduce and check video and audio stored
in the video/audio recorder/player 118.

A method for synthesizing video or audio to be recorded
onto a video/audio recorder/player is not limited to the
above-described example, and may be any method as long
as the user can check the contents of the interpretation service.
In order to support a situation in which the user
terminal is not equipped with the audio
multiplexing/demultiplexing function, audio transmitted to
the caller and audio transmitted to the callee may be
individually recorded and the audio specified by a terminal
may be reproduced and transmitted.

The user may be a person other than the person who has 
obtained the interpretation service. When a person granted 
access right has called the interpretation center from a
videophone terminal and input an access code, he/she may receive
video and audio stored in the video/audio recorder/player
118. 

Fig. 20 shows an example of a recording/reproduction 
function in the videophone interpretation system with 
bidirectional simultaneous interpretation according to the 
third embodiment. As shown in Fig. 20, a video from the caller
terminal video CODEC 324, a video from the callee terminal video
CODEC 344, a video from the first interpreter terminal video
CODEC 364, and a video from the second interpreter terminal
video CODEC 384 are synthesized by the video synthesizer 316
and the result is transmitted to a video/audio
recorder/player 318. The audio output of the audio synthesizer
330 to be transmitted to the caller terminal and the audio output
of the audio synthesizer 350 to be transmitted to the callee
terminal are audio multiplexed by an audio multiplexer 317
such that the former is the left-channel and the
latter is the right-channel, and the result is transmitted
to the video/audio recorder/player 318.

The video output of the video synthesizer 316 and the 
audio output of the audio multiplexer 317 during an 
interpretation service are automatically recorded onto the 
video/audio recorder/player 318 and stored for each user based 
on a command from the controller 310. The video and audio stored 
in the video/audio recorder/player 318 are reproduced based 
on a command from the controller 310 when the 
multiplexer/demultiplexer 322 or 342 detects that a 
predetermined dial number is pressed on the caller terminal 
or callee terminal, and the reproduced video and 
audio are transmitted to each terminal via the video synthesizer 
328 or 348 and the audio synthesizer 330 or 350 for the detected 
terminal. 
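
The per-user recording and the dial-triggered reproduction described above might be modeled as in the following sketch. The class `RecorderPlayer`, the function `on_dial`, and the dial string `"#9"` are all hypothetical names and values chosen for illustration; the specification only states that a predetermined dial number triggers reproduction.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional

PLAYBACK_DIAL = "#9"  # hypothetical predetermined dial number


@dataclass
class Recording:
    video: bytes  # synthesized video (stand-in for video synthesizer output)
    audio: bytes  # multiplexed audio (stand-in for audio multiplexer output)


class RecorderPlayer:
    """Stores recordings separately for each user."""

    def __init__(self) -> None:
        self._store: Dict[str, List[Recording]] = defaultdict(list)

    def record(self, user_id: str, video: bytes, audio: bytes) -> None:
        self._store[user_id].append(Recording(video, audio))

    def latest(self, user_id: str) -> Optional[Recording]:
        # .get() avoids creating an empty entry for unknown users.
        recordings = self._store.get(user_id)
        return recordings[-1] if recordings else None


def on_dial(
    recorder: RecorderPlayer, user_id: str, dialed: str
) -> Optional[Recording]:
    """Return the recording to reproduce, or None if no playback applies."""
    if dialed != PLAYBACK_DIAL:
        return None
    return recorder.latest(user_id)
```

In this sketch the controller role is reduced to the `on_dial` check; routing the reproduced streams to the synthesizers for the detected terminal is left out.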

This allows the user to check video from each terminal 
during an interpretation in a four-split screen shown in Fig. 
21(b). If the user terminal is equipped 
with an audio multiplexing/demultiplexing function, audio 
from each terminal can be checked, in the language of the 
caller in the left channel and in the language of the callee in 
the right channel. The user may call the interpretation center 
at a later time and input a predetermined access code from his/her 
terminal to reproduce and check a video and an audio stored 
in the video/audio recorder/player 318. 

A method for synthesizing a video or audio to be recorded 
onto a video/audio recorder/player is not limited to the 
above-described example, and may be any method as long 
as the user can check the contents of the interpretation service. 
In order to support a situation in which the user 
terminal is not equipped with the audio 
multiplexing/demultiplexing function, an audio transmitted to 
the caller and an audio transmitted to the callee may be 
individually recorded and the audio specified by a terminal 
may be reproduced and transmitted. 

The user may be a person other than the person who has 
obtained the interpretation service. When a person granted 
access rights has called the interpretation center from a 
videophone terminal and input an access code, he/she may receive 
a video and an audio stored in the video/audio recorder/player 
318. 
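
Access by a person other than the original user could be checked as in the sketch below. It assumes that an access code maps to the user whose recordings may be retrieved; the class `AccessRegistry` and its methods are hypothetical, and the specification does not say how access codes are issued or verified.

```python
from typing import Dict, Optional


class AccessRegistry:
    """Maps access codes to the user whose recordings they unlock."""

    def __init__(self) -> None:
        self._owner_by_code: Dict[str, str] = {}

    def grant(self, access_code: str, owner_id: str) -> None:
        """Register an access code for the given user's recordings."""
        self._owner_by_code[access_code] = owner_id

    def resolve(self, entered_code: str) -> Optional[str]:
        """Return the owner whose recordings may be sent, or None."""
        return self._owner_by_code.get(entered_code)
```

A caller who enters a registered code would then receive the recordings stored for the resolved user; an unknown code resolves to nothing.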

Industrial Applicability 

As mentioned above, the videophone interpretation system 
or videophone interpretation method of the invention is 
advantageous in that a caller does not have to search for an 
interpreter in advance and conduct consultation with a 
callee, and in that the system and the method are available 
also in an emergency, thereby minimizing the time for which 
the interpreter is occupied to reduce the interpretation 
service cost. 